FIELD OF THE INVENTION

The present invention relates to variants of proteases belonging to the RP-II or
C-component type, and methods for the construction of such variants with altered
properties, such as stability (e.g. thermostability or storage stability), Ca2+ dependency,
and pH dependent activity.

BACKGROUND OF THE INVENTION 

Enzymes have been used within the detergent industry as part of washing formulations
for more than 30 years. Proteases are from a commercial perspective the
most relevant enzyme in such formulations, but other enzymes including lipases, amylases,
cellulases, hemicellulases or mixtures of enzymes are also often used. Proteases
are also used in other fields, such as production of dairy products, processing of
hides, feed processing, etc.

To improve the cost and/or the performance of proteases there is an ongoing
search for proteases with altered properties, such as increased activity at low temperatures,
increased thermostability, increased specific activity at a given pH, altered Ca2+
dependency, increased stability in the presence of other detergent ingredients (e.g.
bleach, surfactants etc.), modified specificity in respect of substrates, etc.

The search for proteases with altered properties includes both discovery of naturally
occurring proteases, i.e. so-called wild-type proteases, and alteration of well-known
proteases by e.g. genetic manipulation of the nucleic acid sequence encoding
said proteases. Knowledge of the relationship between the three-dimensional structure
and the function of a protein has improved the ability to evaluate which areas of a protein
to alter to affect a specific property of the protein.

One group of proteases, which has been indicated for use in detergents, food
processing and feed processing, is the RP-II proteases or C-component proteases belonging
to the protease family S1B, glutamic-acid-specific endopeptidases. This family has
until now received only relatively minor attention and has not been further grouped into
different sub-groups. However, from the amino acid identities of isolated RP-II proteases
it is evident that subgroups exist. Bacillus proteases of the RP-II type are serine
proteases that in primary structure are similar to chymotrypsin.
The first description of a protease of the RP-II family of Bacillus proteases was in
US Patent No. 4,266,031 (Tang et al., Novo Industri A/S), where it was designated
Component C and tentatively (and incorrectly) characterised as not being a serine protease
or metallo protease. Component C was considered a contaminant in the production
of the Bacillus licheniformis alkaline protease, subtilisin Carlsberg.

In EP 369 817 (Omnigene Bioproducts, Inc.) the B. subtilis member of the RP-II
family was identified by its amino acid and DNA sequences. The enzyme was again
stated not to be a serine protease, and the family name RP-II designated (Residual
Protease II). The enzyme was characterized further as a metallo protease by the inventors
of EP 369 817 (Rufo et al., 1990, J. Bacteriol. 172 1019-1023, and Sloma et al.,
1990, J. Bacteriol. 172 1024-1029), designating the enzyme as mpr.

In WO 91/13553 (Novozymes A/S) the amino acid sequence of the C component 
was disclosed, stating that it is a serine protease specific for glutamic and aspartic acid, 
while EP 482 879 (Shionogi & Co. Ltd.) disclosed the enzyme and a DNA sequence 
encoding the C component from B. licheniformis ATCC No. 14580, naming the enzyme 
BLase. In EP 482 879 the protease is described as being specific for glutamic acid (see 
also Kakudo et al. "Purification, characterization, cloning, and expression of a glutamic 
acid-specific protease from Bacillus licheniformis ATCC 14580". J. Biol. Chem. 
267:23782 (1992)). 

In 1997 Okamoto et al. (Appl. Microbiol. Biotechnol. (1997) 48 27-33) found that
the B. subtilis homologue of BLase, named BSase, was identical to the above-mentioned
enzyme, mpr/RP-II.

In 1999 Rebrikov et al. (Journal of Protein Chemistry, Vol. 18, No. 1, 1999) dis- 
closed a Glu-specific protease from B. intermedius that also belongs to the RP-II family. 

In WO 01/16285 a number of further RP-II proteases were disclosed with DNA
and amino acid sequences. These RP-II proteases were isolated from B. pumilus, B.
halmapalus and B. licheniformis. WO 01/16285 also discloses a number of variants of
RP-II proteases. These variants were based on various concepts relating to the primary
structure of the RP-II proteases (amino acid sequences).

The homology matrix in Table 1 below indicates that the RP-II proteases
1 to 8 are a distinct group of Glu-specific proteases that are clearly different from the
other Glu-specific proteases in the matrix.
In the matrix the sequences are identified by the patent publication in which they were
first published or by sequence database accession numbers.

1. Bacillus sp. JA96 glutamic-acid-specific endopeptidase, JA96, WO 01/16285

2. 1p3e B. intermedius glutamic-acid-specific endopeptidase, BIP, EMBL No. Y5136,
Rebrikov et al., Journal of Protein Chemistry, Vol. 18, No. 1, 1999

3. Bacillus sp. B032 glutamic-acid-specific endopeptidase, B032, WO 01/16285

4. Bacillus licheniformis, BLC, WO 01/16285 (cf. US Patent No. 4,266,031)

5. Bacillus sp. CDJ31 glutamic-acid-specific endopeptidase, CDJ31, WO 01/16285

6. Bacillus sp. AC116 glutamic-acid-specific endopeptidase, AC116, WO 01/16285

7. mpr_bacsu Bacillus subtilis serine protease, MPR, EP 369 817

8. Bacillus sp. AA513 glutamic-acid-specific endopeptidase, AA513, WO 01/16285

9. eta_staau Staphylococcus aureus exfoliative toxin A (Lee et al. Sequence determination
and comparison of the exfoliative toxin A and toxin B genes from Staphylococcus
aureus; J. Bacteriol. 169:3904 (1987))

10. etb_staau Staphylococcus aureus exfoliative toxin B (Jackson, M.P.; Iandolo, J.J.;
Sequence of the exfoliative toxin B gene of Staphylococcus aureus; J. Bacteriol.
Accordingly, the object of the present invention is to provide a method for constructing
RP-II proteases having altered properties, in particular to provide a method for
constructing RP-II proteases having altered properties as described above.

Thus, in its broadest aspect, the present invention relates to a method for constructing
a variant of a parent RP-II protease, wherein the variant has at least one altered
property as compared to said parent RP-II protease, which method comprises:

i) analyzing the three-dimensional structure of the RP-II protease to identify, on the basis
of an evaluation of structural considerations, at least one amino acid residue or at
least one structural region of the RP-II protease, which is of relevance for altering said
property;

ii) constructing a variant of the RP-II protease, which as compared to the parent RP-II
protease, has been modified in the amino acid residue or structural part identified in i)
so as to alter said property; and

iii) testing the resulting RP-II protease variant for said property.

Although it has been described in the following that modification of the parent
RP-II protease in certain regions and/or positions is expected to confer a particular effect
to the thus produced RP-II protease variant, it should be noted that modification of
the parent RP-II protease in any of such regions may also give rise to any other of the
above-mentioned effects. For example, any of the regions and/or positions mentioned
as being of particular interest with respect to, e.g., improved thermostability, may also
give rise to, e.g., higher activity at a lower pH, an altered pH optimum, or increased
specific activity, such as increased peptidase activity.

Further aspects of the present invention relate to variants of a RP-II protease,
the DNA encoding such variants and methods of preparing the variants. Still further aspects
of the present invention relate to the use of the variants for various industrial
purposes, in particular as an additive in detergent compositions. Other aspects of the
present invention will be apparent from the below description as well as from the appended
claims.

BRIEF DESCRIPTION OF DRAWINGS

Fig. 1 provides a schematic structure of the RP-II protease from Bacillus licheniformis,
BLC.

Fig. 2 shows a 3D structure based alignment of the wild type RP-II proteases 1 
to 8 of Table 1. 
Fig. 3 shows the BLC protease ribbon structure in black, with indication of active
site residues, the bound peptide and the ion-binding site. The calcium ion is the sphere
at the bottom of the Figure, the active site residues are in light grey and shown in stick
model, and the bound peptide DAFE is in medium grey and shown in stick model.

BRIEF DESCRIPTION OF APPENDICES 

APPENDIX 1 provides the structural coordinates for the solved crystal 3D structure
of the BLC RP-II protease, in the standard PDB format. The residues are numbered
from 1-217, the calcium ion is numbered 301, and the DAFE substrate is numbered
401-404.

DEFINITIONS 

Prior to discussing this invention in further detail, the following terms and con- 
ventions will first be defined. 

For a detailed description of the nomenclature of amino acids and nucleic acids 
and modifications introduced in a polypeptide or protein and especially in a RP-II prote- 
ase by genetic manipulation, we refer to WO 01/16285 pages 5 to 15, hereby incorpo- 
rated by reference. 

The term "RP-II proteases" refers to a sub-group of serine proteases, belonging to
the protease family S1B, glutamic-acid-specific endopeptidases. Serine proteases or
serine peptidases are a subgroup of proteases characterised by having a serine in the
active site, which forms a covalent adduct with the substrate. Further the RP-II proteases
(and the serine proteases) are characterised by having two active site amino acid
residues apart from the serine, namely a histidine and an aspartic acid residue.

The RP-II proteases have a homology to the rest of the S1B protease family of
around 50% (using the UWGCG version 8 software GAP program), or more preferred a
homology higher than 55%. Table 1 demonstrates homologies between various S1B
proteases. The RP-II proteases, nos. 1 to 8, are in Table 1 indicated in bold and the
other S1B proteases, nos. 9 to 13, in bold italics. Table 1 shows that there is a clear
distinction between the RP-II proteases and the other S1B proteases, but it is also clear that
among the RP-II proteases there are subgroups. One subgroup comprises nos. 1, 2,
and 3; and another subgroup comprises nos. 4, 5, and 6. The lengths of the listed RP-II
proteases vary from 215 to 222 amino acid residues, and experience within the subtilisin
subgroups of subtilases indicates that such a variation in length probably has only
little effect on the 3-dimensional structures of these and other RP-II protease subgroups.

PARENT 

The term "parent" is in the context of the present invention to be understood as a
protein, which is modified to create a protein variant. The parent protein may be a naturally
occurring (wild-type) polypeptide or it may be a variant thereof prepared by any
suitable means. For instance, the parent protein may be a variant of a naturally occurring
protein which has been modified by substitution, chemical modification, deletion or
truncation of one or more amino acid residues, or by addition or insertion of one or
more amino acid residues to the amino acid sequence, of a naturally-occurring polypeptide.
Thus the term "parent RP-II protease" refers to a RP-II protease which is modified
to create a RP-II protease variant.

VARIANT

The term "variant" is in the context of the present invention to be understood as 
a protein which has been modified as compared to a parent protein at one or more 
amino acid residues. 

MODIFICATION

The term "modification(s)" or "modified" is in the context of the present invention
to be understood as to include chemical modification of a protein as well as genetic
manipulation of the DNA encoding a protein. The modification(s) may be replacement(s)
of the amino acid side chain(s), substitution(s), deletion(s) and/or insertion(s) in
or at the amino acid(s) of interest. Thus the term "modified protein", e.g. "modified RP-II
protease", is to be understood as a protein which contains modification(s) compared to
a parent protein, e.g. RP-II protease.

HOMOLOGY

"Homology" or "homologous to" is in the context of the present invention to be
understood in its conventional meaning and the "homology" between two amino acid
sequences should be determined by use of the "Similarity" parameter defined by the
GAP program from the University of Wisconsin Genetics Computer Group (UWGCG)
package using default settings for alignment parameters, comparison matrix, gap and
gap extension penalties. The default values for GAP penalties are a GAP creation penalty of
3.0 and a GAP extension penalty of 0.1 (Program Manual for the Wisconsin Package,
Version 8, August 1994, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin,
USA 53711). The method is also described in S.B. Needleman and C.D.
Wunsch, Journal of Molecular Biology, 48, 443-445 (1970). Identities can be extracted
from the same calculation. The homology between two amino acid sequences can also
be determined by "identity" or "similarity" using the GAP routine of the UWGCG package
version 9.1 with default settings for alignment parameters, comparison matrix, gap
and gap extension penalties. The following parameters can also be applied: gap
creation penalty = 8 and gap extension penalty = 8, with all other parameters kept at
their default values. The output from the routine is, besides the amino acid alignment,
the calculation of the "Percent Identity" and the "Similarity" between the two sequences.
The numbers calculated using UWGCG package version 9.1 are slightly different from
those of version 8.
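As an illustration of how an identity figure can be extracted from a finished pairwise alignment, the following minimal Python sketch counts identical residues over the aligned columns. It is not the GAP program: the function name, the "-" gap symbol and the choice to exclude gapped columns from the denominator are assumptions made for this example, and the figures GAP reports may be computed differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns where neither sequence has a gap.

    Assumption for this sketch: gapped columns are excluded from the
    denominator. The sequences must already be aligned (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

For two aligned toy strings such as "ACDEF" and "ACEEF" the function compares five columns and finds four identical residues.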



NAMING OF RP-II PROTEASES

In describing the RP-II proteases of the invention the following abbreviations are 
used for ease of reference: 

BLC = RP-II protease from Bacillus licheniformis (US Patent No. 4,266,031),
AA513 = RP-II protease from Bacillus halmapalus AA513 (WO 01/16285),
AC116 = RP-II protease from Bacillus licheniformis AC116 (WO 01/16285),
B032 = RP-II protease from Bacillus pumilus B032 (WO 01/16285),
CDJ31 = RP-II protease from Bacillus licheniformis CDJ31 (WO 01/16285),
JA96 = RP-II protease from Bacillus pumilus JA96 (WO 01/16285),
MPR = RP-II protease from Bacillus subtilis IS75 (EP 369 817 B1),
BIP = RP-II protease from B. intermedius (Rebrikov et al., Journal of Protein Chemistry,
Vol. 18, No. 1, 1999)



SEQUENCE LISTING

In the appended Sequence Listing the RP-II proteases are indicated as:
SEQ. ID. NO. 1 = BLC (DNA), SEQ. ID. NO. 2 = BLC (AA),
SEQ. ID. NO. 3 = AA513 (DNA), SEQ. ID. NO. 4 = AA513 (AA),
SEQ. ID. NO. 5 = AC116 (DNA), SEQ. ID. NO. 6 = AC116 (AA),
SEQ. ID. NO. 7 = B032 (DNA), SEQ. ID. NO. 8 = B032 (AA),
SEQ. ID. NO. 9 = CDJ31 (DNA), SEQ. ID. NO. 10 = CDJ31 (AA),
SEQ. ID. NO. 11 = JA96 (DNA), SEQ. ID. NO. 12 = JA96 (AA),
SEQ. ID. NO. 13 = BSMPR (DNA), SEQ. ID. NO. 14 = BSMPR (AA),
SEQ. ID. NO. 15 = BIP (DNA), SEQ. ID. NO. 16 = BIP (AA)

POSITION 

The term "position" is in the context of the present invention to be understood as
the number of an amino acid residue in a peptide, polypeptide or protein when counting
from the N-terminal end of said peptide/polypeptide. The position numbers used here
normally refer directly to different RP-II proteases.

The RP-II proteases are numbered individually according to each of SEQ ID NO:
2, 4, 6, 8, 10, 12, 14, and 16.

Corresponding position

The invention, however, is not limited to variants of these particular RP-II proteases
but extends to parent proteases containing amino acid residues at positions which
are "equivalent" to the particular identified residues in Bacillus licheniformis RP-II protease.
In some preferred embodiments of the present invention, the parent protease is
JA96 or BIP RP-II protease and the substitutions are made at the equivalent amino
acid residue positions in JA96 or BIP corresponding to those listed above.

A residue (amino acid) position of a RP-II protease is equivalent to a residue
(position) of the Bacillus licheniformis RP-II protease if it is either homologous (i.e., corresponding
in position in either primary or tertiary structure) or analogous to a specific
residue or portion of that residue in Bacillus licheniformis RP-II protease (i.e., having
the same or similar functional capacity to combine, react, or interact chemically).

In order to establish homology to primary structure, the amino acid sequence of
a precursor protease is directly compared to the Bacillus licheniformis RP-II protease,
BLC, primary sequence. Aligning the amino acid sequence of an isolated or parent
wild type enzyme with a suitable well-known enzyme of the same group or class of enzymes
defines a frame of reference. This type of numbering was used in WO 01/16285.
If nothing else is indicated herein, in the present instance the Bacillus licheniformis RP-II
protease, first designated component C and therefore here abbreviated BLC, has
been chosen as standard.

In order to establish homology to the tertiary structure (3D structure) of BLC, the 3D
structure based alignment in Fig. 2 has been provided. By using this alignment the amino
acid sequence of a precursor RP-II protease may be directly correlated to the Bacillus
licheniformis RP-II protease, BLC, primary sequence. For a novel RP-II protease sequence,
the (3D based) position corresponding to a position in BLC is found by

i) identifying the RP-II protease from the alignment of Fig. 2 that is most homolo- 
gous to the novel sequence, 

ii) aligning the novel sequence with the sequence identified to find the correspond- 
ing position in the RP-II protease from Fig. 2, and 

iii) establishing from Fig. 2 the corresponding position in BLC.
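Steps ii) and iii) above amount to chaining two position maps: novel sequence to the most homologous Fig. 2 sequence, and that sequence to BLC. The following Python sketch illustrates the chaining under stated assumptions: the function names are hypothetical, gaps are written as asterisks, the alignment rows are supplied as already-aligned strings of equal length, and step i) (choosing the most homologous sequence with GAP) is assumed to have been done beforehand.

```python
from typing import Dict, Optional

GAP = "*"  # gap symbol, assumed here to follow the asterisk convention


def column_map(row_from: str, row_to: str) -> Dict[int, Optional[int]]:
    """Map 1-based residue positions of row_from to positions of row_to.

    Both rows must come from the same alignment (equal length). A residue
    of row_from that faces a gap in row_to maps to None.
    """
    if len(row_from) != len(row_to):
        raise ValueError("alignment rows must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    pos_from = 0
    pos_to = 0
    for a, b in zip(row_from, row_to):
        if b != GAP:
            pos_to += 1
        if a != GAP:
            pos_from += 1
            mapping[pos_from] = pos_to if b != GAP else None
    return mapping


def corresponding_blc_position(
    novel_pos: int,
    novel_row: str,          # novel sequence, aligned pairwise to the reference
    ref_row_pairwise: str,   # reference sequence in that pairwise alignment
    ref_row_fig2: str,       # reference sequence row from the Fig. 2 alignment
    blc_row_fig2: str,       # BLC row from the Fig. 2 alignment
) -> Optional[int]:
    """Return the BLC position corresponding to novel_pos, or None."""
    novel_to_ref = column_map(novel_row, ref_row_pairwise)   # step ii)
    ref_to_blc = column_map(ref_row_fig2, blc_row_fig2)      # step iii)
    ref_pos = novel_to_ref.get(novel_pos)
    if ref_pos is None:
        return None
    return ref_to_blc.get(ref_pos)
```

A result of None means that the residue has no counterpart in BLC, i.e. it falls in an insertion relative to the reference sequence or relative to BLC.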



For comparison and finding the most homologous sequence the GAP program from
the GCG package as described below is used.

The alignment can as indicated above be obtained by the GAP routine of the GCG
package version 8 to number the variants using the following parameters: gap creation
penalty = 3 and gap extension penalty = 0.1 and all other parameters kept at their default
values.

The alignment of Fig. 2 defines a number of deletions and insertions in relation
to the sequence of BLC. In the alignment deletions are indicated by asterisks (*) in the
referenced sequence, and the referenced enzyme will be considered to have a gap at
the position in question. Insertions are indicated by asterisks (*) in the BLC sequence,
and the positions in the referenced enzyme are given as the position number of the last
amino acid residue where a corresponding amino acid residue exists in the standard
enzyme with a lower case letter appended in alphabetical order, e.g. 82a, 82b, 82c,
82d, see Fig. 2.
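The numbering convention for deletions and insertions can be sketched in Python as follows. This is an illustrative sketch only: the function name and the toy alignment rows are hypothetical, and N- or C-terminal extensions are not given any special treatment here.

```python
import string
from typing import List


def blc_numbering(blc_row: str, ref_row: str, gap: str = "*") -> List[str]:
    """Label every residue of ref_row with its BLC-based position.

    Residues aligned to a BLC residue receive that BLC position number.
    Residues inserted relative to BLC receive the number of the last
    aligned BLC position plus a lower case letter (a, b, c, ...).
    Deletions (gap in ref_row) produce no label. Supports at most 26
    consecutive inserted residues.
    """
    if len(blc_row) != len(ref_row):
        raise ValueError("alignment rows must have equal length")
    labels: List[str] = []
    blc_pos = 0      # last BLC position seen so far
    insert_idx = 0   # index of the next insertion letter
    for b, r in zip(blc_row, ref_row):
        if b != gap:
            blc_pos += 1
            insert_idx = 0
            if r != gap:
                labels.append(str(blc_pos))
            # r == gap: deletion in the referenced enzyme, nothing to label
        elif r != gap:
            labels.append(f"{blc_pos}{string.ascii_lowercase[insert_idx]}")
            insert_idx += 1
    return labels
```

With the toy rows "AB**CD" (BLC) and "ABXY*D" (referenced enzyme), the two inserted residues are labelled 2a and 2b, and position 3 is skipped because the referenced enzyme has a gap there.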

In case the referenced enzyme contains a N- or C-terminal extension in comparison
to BLC, an N-terminal extension is given the position number 0a, 0b, etc. in the
direction of the N-terminal; and a C-terminal extension will be given either the position
number of the C-terminal amino acid residue of BLC with a lower case letter appended
in alphabetical order, or simply a continued consecutive numbering.

Thus for comparisons RP-II proteases are numbered by reference to the posi- 
tions of the BLC RP-II protease (SEQ ID NO: 2) as provided in Fig. 2. The position is 
then indicated as "corresponding to BLC". 

DETAILED DESCRIPTION OF THE INVENTION
The inventors of the present invention have elucidated the three-dimensional
structure of BLC, SEQ ID NO:2, by X-ray crystallography and found that there are several
interesting features in the structure of this protease in comparison with the known
structures of other proteases, such as the RP-II proteases. These features include both
similarities and differences.



RP-II proteases 

As described above a RP-II protease is in the context of the present invention to
be understood as a protease which has at least 50% homology to BLC (SEQ ID NO:2).
In particular said protease may have at least 55% homology to BLC, i.e. to SEQ ID
NO:2. The invention thus relates to variant RP-II proteases having at least 50% homology
to BLC.

Specifically the variants of the invention may comprise RP-II proteases comprising
a number of modifications or modifications in a number of positions ranging from at
least one and up to 50, or from 1 to 45, or from 1 to 40, or from 1 to 35, or from 1 to 30,
or from 1 to 25, or from 1 to 20, or from 1 to 15, or from 1 to 14, or from 1 to 13, or from
1 to 12, or from 1 to 11, or from 1 to 10, or from 1 to 9, or from 1 to 8, or from 1 to 7, or
from 1 to 6, or from 1 to 5, or from 1 to 4, or from 1 to 3, or from 1 to 2 modifications or
positions. Such modifications comprise substitutions, deletions and insertions in the
indicated number or number of positions.

A RP-II protease variant of the present invention is encoded by an isolated 
polynucleotide, which nucleic acid sequence has at least 50% homology with the nu- 
cleic acid sequence shown in SEQ ID NO: 1, and where the polynucleotide encodes a 
variant RP-II protease in relation to a parent protease. 

In a first embodiment of the present invention a RP-II protease suitable for the
purpose described herein may be a RP-II protease homologous to the three-dimensional
structure of BLC, i.e. it may be homologous to the three-dimensional structure
defined by the structure coordinates in Appendix 1 by comprising the structural
elements defined below.

It is well-known to a person skilled in the art that a set of structure coordinates
for a protein or a portion thereof is a relative set of points that define a shape in three
dimensions; it is possible that an entirely different set of coordinates defines an identical
or a similar shape. Moreover, slight variations in the individual coordinates may
have little or no effect on the overall shape.
These variations in coordinates may be generated because of mathematical
manipulations of the structure coordinates. For example, the structure coordinates of 
Appendix 1 (BLC structure) may be manipulated by crystallographic permutations of 
the structure coordinates, fractionalization of the structure coordinates, integer addi- 
tions or subtractions to sets of the structure coordinates, inversion of the structure co- 
ordinates or any combination of the above. Alternatively, said variations may be due to 
differences in the primary amino acid sequence. 

When such variations are within an acceptable standard error as compared to 
the structure coordinates of Appendix 1 said three-dimensional structure is within the 
context of the present invention to be understood as being homologous to the structure 
of Appendix 1. The standard error may typically be measured as the root mean square 
deviation of e.g. conserved backbone residues, where the term "root mean square de- 
viation" (RMS) means the square root of the arithmetic mean of the squares of the de- 
viations from the mean. 
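As a worked illustration of the root mean square deviation, the following Python sketch computes it for two equally long lists of matched atom coordinates, for example the c-alpha atoms of conserved backbone residues. The function name is hypothetical, and the sketch assumes that the two structures have already been superimposed and that the atoms are paired in the same order; it performs no superposition itself.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root mean square deviation between paired atom coordinates (in Å).

    Assumes coords_a[i] and coords_b[i] are equivalent atoms of two
    structures that are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between the two equivalent atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Because only paired atoms enter the sum, the choice of which atoms count as equivalent determines the value obtained.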

It is also well-known to a person skilled in the art that within a group of proteins
which have a homologous structure there may be variations in the three-dimensional
structure in certain areas or domains of the structure, e.g. loops, which are of no, or at
least only of small, importance to the functional domains of the structure, but which
may result in a large root mean square deviation of the conserved residue backbone atoms
between said structures.

Thus it is well known that a set of structure coordinates is unique to the crystal- 
lised protein. No other three dimensional structure will have the exact same set of co- 
ordinates, be it a homologous structure or even the same protein crystallised in differ- 
ent manner. There are natural fluctuations in the coordinates. The overall structure and 
the inter-atomic relationship can be found to be similar. The similarity can be discussed
in terms of root mean square deviation of each atom of a structure from each "homolo- 
gous" atom of another structure. However, only identical proteins have the exact same 
number of atoms. Therefore, proteins having a similarity below 100% will often have a 
different number of atoms, and thus the root mean square deviation can not be calcu- 
lated on all atoms, but only the ones that are considered "homologous". A precise de- 
scription of the similarity based on the coordinates is thus difficult to describe and diffi- 
cult to compute for homologous proteins. Regarding the present invention, similarities 
in 3D structure of different RP-II proteases can be described by the content of homologous
structural elements, and/or the similarity in amino acid or DNA sequence.

Examples of BLC-like RP-II proteases include BLC = RP-II protease from
Bacillus licheniformis (cf. US Patent No. 4,266,031), AA513 = RP-II protease from Bacillus
halmapalus AA513 (NP000368), AC116 = RP-II protease from Bacillus licheniformis
AC116 (NP000364), B032 = RP-II protease from Bacillus pumilus B032
(NP000366), CDJ31 = RP-II protease from Bacillus licheniformis CDJ31 (NP000365),
JA96 = RP-II protease from Bacillus pumilus JA96 (NP000367), MPR = RP-II protease
from Bacillus subtilis IS75 (cf. EP 369 817 B1), BIP = RP-II protease from B. intermedius
(EMBL No. Y5136, Rebrikov et al., Journal of Protein Chemistry, Vol. 18, No. 1,
1999).

Accordingly, a preferred embodiment of the present invention is a variant of a
parent RP-II protease or a RP-II protease variant which is at least 50% homologous to
the sequence of SEQ ID NO:2, preferably at least 55%, preferably at least 65%, at least
70%, at least 74%, at least 80%, at least 83%, at least 90%, at least 91%, at least 92%,
at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at
least 99% homologous to the sequence of SEQ ID NO:2, 4, 6, 8, 10, 12, 14 or 16.

A further embodiment of the invention is a RP-II protease variant comprising the
following structural characteristics:

a) two beta-barrel domains each comprising six long strands in antiparallel organi- 
sation, 

b) three alpha helices, 

c) at least one ion-binding site,

d) an active site comprising the amino acid residues His, Asp and Ser. 

The potential ion binding site is defined as having a coordination or arrangement
of coordinates similar to that in the 3D structure of BLC, which has one calcium ion
coordinated by the Ile 3 carbonyl oxygen atom, the Ser 5 carbonyl oxygen atom and,
in bidentate fashion, the Asp 161 carboxylic acid group, with further coordination
provided by water molecules. The calcium may be substituted in the structure by water,
which then has the same coordination.

The RP-II protease variants of the present invention are encoded by isolated
polynucleotides, which nucleic acid sequence has at least 45%, at least 50%, at least
55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%,
at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99% homology with the nucleic acid
sequence shown in SEQ ID NO:1, 3, 5, 7, 9, 11, 13, or 15, and where the polynucleotide
encodes a variant RP-II protease in relation to a parent protease.

Further the isolated nucleic acid sequence encoding a RP-II protease variant of
the invention hybridizes with a complementary strand of the nucleic acid sequence
shown in SEQ ID NO: 1 preferably under low stringency conditions, at least under medium
stringency conditions, at least under medium/high stringency conditions, at least
under high stringency conditions, at least under very high stringency conditions.
Suitable experimental conditions for determining hybridization at low, medium, or
high stringency between a nucleotide probe and a homologous DNA or RNA sequence
involve presoaking of the filter containing the DNA fragments or RNA to hybridize in 5
x SSC (Sodium chloride/Sodium citrate, Sambrook et al. 1989) for 10 min, and prehybridization
of the filter in a solution of 5 x SSC, 5 x Denhardt's solution (Sambrook et al.
1989), 0.5 % SDS and 100 µg/ml of denatured sonicated salmon sperm DNA (Sambrook
et al. 1989), followed by hybridization in the same solution containing a concentration
of 10 ng/ml of a random-primed (Feinberg, A. P. and Vogelstein, B. (1983) Anal.
Biochem. 132:6-13), 32P-dCTP-labeled (specific activity > 1 x 10^9 cpm/µg) probe for 12
hours at ca. 45°C. The filter is then washed twice for 30 minutes in 2 x SSC, 0.5 %
SDS at least 55°C (low stringency), more preferably at least 60°C (medium stringency),
still more preferably at least 65°C (medium/high stringency), even more preferably at
least 70°C (high stringency), and even more preferably at least 75°C (very high stringency).
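The wash temperatures listed above can be read as a simple threshold table. The following Python sketch, given purely as an illustration, returns the highest stringency level whose minimum wash temperature is met; the names used are hypothetical and all other conditions (2 x SSC, 0.5 % SDS, two washes of 30 minutes) are assumed to be as stated in the text.

```python
from typing import Optional

# Minimum wash temperatures in °C, taken from the text, highest level first.
WASH_THRESHOLDS_C = [
    (75.0, "very high"),
    (70.0, "high"),
    (65.0, "medium/high"),
    (60.0, "medium"),
    (55.0, "low"),
]


def stringency_for_wash(temp_c: float) -> Optional[str]:
    """Return the highest stringency level met by a wash temperature.

    Returns None when the temperature is below the low stringency
    threshold of 55°C.
    """
    for threshold, level in WASH_THRESHOLDS_C:
        if temp_c >= threshold:
            return level
    return None
```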

Three-dimensional structure of RP-II proteases

The BLC RP-II protease was used to elucidate the three-dimensional structure 
forming the basis for the present invention. 

The structure of BLC was solved in accordance with the principle for x-ray
crystallographic methods, for example, as given in X-Ray Structure Determination,
Stout, G.H. and Jensen, L.H., John Wiley & Sons, Inc. NY, 1989.

The structural coordinates for the solved crystal structure of BLC are given in
standard PDB format (Protein Data Bank, Brookhaven National Laboratory, Brookhaven,
CT) as set forth in Appendix 1. It is to be understood that Appendix 1 forms part
of the present application. In the context of Appendix 1, the following abbreviations are
used: CA refers to c-alpha (carbon atoms) or to calcium ions (however, to avoid misunderstandings
we normally use the full names "c-alpha atoms", "calcium", "Ca" or "ion" in
the present specification). Amino acid residues are given in their standard three-letter
code or the standard one-letter code. The structural coordinates in Appendix 1 contain
the protease structure wherein the active serine was replaced by alanine and a complex
formed with the peptide DAFE (= Asp-Ala-Phe-Glu) as well as water molecules.
The protease coordinates have a chain identification called A, whereas the peptide is
called B, the calcium ion is called C, and the water is W. In the following the positions
of the mentioned residues refer to the sequence of BLC as disclosed in SEQ ID NO: 2.
The overall structure of BLC falls into the S1 group of the proteases (MEROPS;
http://merops.sanger.ac.uk/). The structure is a trypsin type of fold with two beta-barrel
domains. The beta-barrels each consist of six antiparallel beta-strands folded into a
beta-barrel. The topology can be described as S1-S2-S3-S6-S5-S4 for the strands in
both beta-barrels. It is assumed that all the RP-II proteases fall within the same general
overall structure.

The 3D structure of the C-component serine protease from Bacillus licheniformis has
16 strands, of which the 12 larger strands compose the two beta-barrels, and 3 helices.
The four very short strands are numbers 1, 5, 6 and 10 counting from the N-terminal and
are composed of residue numbers 9-10, 50-51, 56-57 and 114-115. The other strands
are residue numbers 22-26, 31-36, 41-44, 62-65, 77-83, 99-102, 126-131, 142-151,
156-159, 171-177, 182-192 and 201-205. One main helix is located C-terminally at
residue numbers 208-219. Two very small helices are composed of residues 86-90 and 106-110.

The active site consists of a triad involving the Ser in position 167, the His in po- 
sition 47, and the Asp in position 96. 

The 3D structure of BLC has one calcium ion coordinated by the carbonyl oxygen
atom of Ile in position 3, the carbonyl oxygen atom of Ser in position 5, and bidentately
by the carboxylic acid group of Asp in position 161. Further coordinations are
made by water molecules.

The calcium ion is placed at a distance from the CA atoms of the active site residues and
Gly in position 168 as provided below:
Ser 167 CA atom to Ca ion: 16.07 Å
His 47 CA atom to Ca ion: 24.27 Å
Asp 96 CA atom to Ca ion: 23.72 Å
Gly 168 CA atom to Ca ion: 19.20 Å

The position of an ion-binding site can be defined by the distance to four specific
atoms in the core structure. The distance from the ion-binding site to the c-alpha atoms
of the three active site residues has been chosen. Throughout the RP-II proteases the
residues Ser, His and Asp in the active site are highly conserved. In BLC they are
Asp96, His47 and Ser167. The fourth distance chosen is the distance to the c-alpha
atom of the amino acid residue coming first after the active site serine residue in the
sequence (hereinafter called "next to Ser"); in the 3D structure of BLC it is Gly168.

In a preferred embodiment of the present invention, the distance between the
ion-binding site and i) Asp c-alpha atom is 22.50-24.00 Å, ii) His c-alpha atom is
23.25-25.25 Å, iii) Ser c-alpha atom is 15.00-17.00 Å, iv) next to Ser c-alpha atom is
18.20-20.20 Å.

However, these distances may vary from one RP-II protease to the other, and as
described above, the ion-binding site may also bind a sodium ion. The present
distances are given with a calcium ion in the structure. If a sodium ion were bound instead,
the distances would be shifted slightly. Generally the distances can vary ±0.8 Å,
preferably ±0.7 Å, ±0.6 Å, ±0.5 Å, ±0.4 Å, or most preferably ±0.3 Å.
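
By way of illustration only, the distance criteria above can be checked programmatically. The following minimal sketch encodes the four preferred windows and the BLC distances stated in the text; the function name, the dictionary layout, and the reading of the stated variation as a symmetric widening of each window are assumptions made for this example and are not part of the original disclosure.

```python
# Preferred windows (in Angstrom) from the ion-binding site to four c-alpha atoms,
# as stated in the text. The data layout is an illustrative assumption.
PREFERRED_RANGES = {
    "Asp": (22.50, 24.00),
    "His": (23.25, 25.25),
    "Ser": (15.00, 17.00),
    "next to Ser": (18.20, 20.20),
}

# Distances reported in the text for BLC with a calcium ion bound.
BLC_DISTANCES = {
    "Asp": 23.72,
    "His": 24.27,
    "Ser": 16.07,
    "next to Ser": 19.20,
}


def matches_ion_site(distances, tolerance=0.0):
    """Return True if every distance lies inside its window widened by +/- tolerance.

    Assumption: the stated variation (e.g. 0.8 Angstrom) is applied symmetrically
    to both ends of each preferred window.
    """
    for name, (low, high) in PREFERRED_RANGES.items():
        value = distances[name]
        if not (low - tolerance <= value <= high + tolerance):
            return False
    return True


if __name__ == "__main__":
    print(matches_ion_site(BLC_DISTANCES))
```

A candidate site in another RP-II protease model could be tested in the same way by supplying its own four distances, optionally with a tolerance such as 0.8 or 0.3.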

Further, in the RP-II proteases, the peptide structure circumscribing the ion-binding
site is composed of the amino acid residues placed in positions 1-7, 159-162
and 143-145 with the coordinating atoms being the backbone carbonyl oxygen atom of
residues I3, S5, D161 and water molecules.

3D structures of RP-II proteases can be modelled using the known structure of a
related protease and general modelling tools as shown in Example 1. A prerequisite for
obtaining a realistic 3D model structure is that the model is based on an adequate
sequence homology higher than 50%, preferably higher than 55%, and even more
preferably higher than 60% to the sequence of the protease for which the structure is
known. RP-II protease models can be constructed based on the 3D guided sequence
alignments to BLC in Figure 2.
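
As an illustration of the homology prerequisite above, the following minimal sketch scores a pair of pre-aligned sequences against the 50%, 55% and 60% thresholds. It assumes that homology is measured as percent identity over aligned, non-gap columns, which the text does not specify; the function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are not counted (an assumption of this sketch)
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def homology_tier(identity):
    """Map a percent identity onto the thresholds named in the text."""
    if identity > 60.0:
        return "even more preferred"
    if identity > 55.0:
        return "preferred"
    if identity > 50.0:
        return "adequate"
    return "insufficient"


if __name__ == "__main__":
    # Toy aligned fragments, not taken from Figure 2.
    score = percent_identity("ACDE-F", "ACDQGF")
    print(score, homology_tier(score))
```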

Therefore 3D structure models of RP-II proteases could in principle be made by
using the modelling tools and the known 3D structure of the toxin A protease from
Staphylococcus aureus from the Exf family of proteases (Cavarelli et al. (1997) The
Structure of Staphylococcus aureus Epidermolytic Toxin A, an atypic serine protease,
at 1.7 Å resolution, Structure, Vol. 5, p. 813 (pdb name 1ARP)).

If compared to the structure of the toxin A protease from Staphylococcus aureus,
the structure of the RP-II proteases, as represented by BLC, can be divided into a
"common protease" region, an "intermediate" region and a "nonhomologous" region.

The active site can be found in the common protease region, which is structurally
closely related to the Toxin A structure. The common protease region is composed
of residues 58, 70-83. The common protease region has an RMS lower than 1.2.

Outside the common protease region the structure of the RP-II protease BLC
differs from the Toxin A structure to a greater extent.

The intermediate region consists of residues 14-28, 29-51, 94-104, 155-175.
The intermediate region has an RMS bigger than 1.2 and less than 1.8. Any
relationships between the three-dimensional structure and functionality based on modelling
from the S. aureus 3D structure are potentially difficult to predict in this region of the
RP-II proteases.

The common region and the intermediate region consist of the majority of the 
two central beta-barrels, especially the strands of the beta-barrels. 

The nonhomologous region consists of residues 1-6, 7-13, 52-57, 59-69, 84-88,
89-93, 105-153. The nonhomologous region has an RMS higher than 1.5. Any
relationships between the three-dimensional structure and functionality based on modelling
from the S. aureus 3D structure are very difficult to predict in this region of the RP-II
proteases.

Inferred structure-function relationships based on model building of an RP-II
protease 3D structure on the 3D structure of S. aureus Toxin A would thus be very
uncertain and speculative.
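
For illustration, the three regions defined above can be held in a small lookup table so that any BLC position can be classified. This is a minimal sketch; the names REGIONS and region_of are hypothetical, and positions that the text assigns to no region are reported as None.

```python
# Residue ranges (inclusive, BLC numbering) as listed in the text for each region.
REGIONS = {
    "common protease": [(58, 58), (70, 83)],
    "intermediate": [(14, 28), (29, 51), (94, 104), (155, 175)],
    "nonhomologous": [(1, 6), (7, 13), (52, 57), (59, 69), (84, 88), (89, 93), (105, 153)],
}


def region_of(position):
    """Return the region name for a BLC position, or None if the text lists none."""
    for name, ranges in REGIONS.items():
        if any(start <= position <= end for start, end in ranges):
            return name
    return None


if __name__ == "__main__":
    for pos in (3, 20, 75, 154):
        print(pos, region_of(pos))
```

Such a lookup may help in judging how much weight a prediction derived from a Toxin A based model deserves at a given position.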

Homology building of RP-II proteases 

A model structure of an RP-II protease can be built using the BLC structure in
Appendix 1, or a structure similar to the BLC structure comprising the structural elements
(a) two beta-barrel domains each comprising six long strands in antiparallel
organisation, (b) three alpha helices, (c) at least one low affinity ion-binding site, and (d) an
active site comprising the amino acid residues His, Asp and Ser, or other 3D RP-II
protease structures, e.g. established by X-ray structure determination, that may become
available in the future, and the Homology™ program or a comparable program, e.g.,
Modeller™ (both from Molecular Simulations, Inc., San Diego, CA). The principle is to
align the amino acid sequence of a protein for which the 3D structure is known with the
amino acid sequence of a protein for which a model 3D structure has to be constructed.
The structurally conserved regions can then be built on the basis of consensus
sequences. In areas lacking homology, loop structures can be inserted, or sequences can
be deleted with subsequent bonding of the necessary residues using, e.g., the program
Homology. Subsequent relaxation and optimization of the structure should be done
using either Homology or another molecular simulation program, e.g., CHARMm™ from
Molecular Simulations.

Methods for designing BLC and RP-II or S1B family protease variants

Comparisons of the molecular dynamics of different proteins can give a hint as
to which domains are important for, or connected to, certain properties of each
protein.

The present invention comprises a method of producing a variant of a parent
BLC-like RP-II protease, the variant having at least one altered property as compared
to the parent BLC-like RP-II protease, the method comprising:

a) producing a model structure of the parent BLC-like RP-II protease on the
three-dimensional structure of BLC;

b) comparing the model three-dimensional structure of the parent BLC-like RP-II
protease to the BLC structure by superimposing the structures through
matching the CA, CB, C, O, and N atoms of the active site residues;

c) identifying on the basis of the comparison in step b) at least one structural
part of the parent BLC-like RP-II protease, wherein an alteration in said
structural part is predicted to result in an altered property;

d) modifying the nucleic acid sequence encoding the parent BLC-like RP-II
protease to produce a nucleic acid sequence encoding deletion or substitution of
one or more amino acids at a position corresponding to said structural part,
or an insertion of one or more amino acid residues in positions corresponding
to said structural part;

e) expressing the modified nucleic acid sequence in a host cell to produce the
variant RP-II protease;

f) isolating the produced protease;

g) purifying the isolated protease; and

h) recovering the purified RP-II protease.



Stability - alteration of ion-binding site 

An ion-binding site is a significant feature of an enzyme. Therefore alterations of 
the amino acid residues close to the ion-binding site are likely to result in alterations of 
the stability of the enzyme. Especially modifications affecting the charge distribution 
and/or the electrostatic field strength at or in the vicinity of the site are important. 



Improved stability 






Stabilisation of the ion-binding site of RP-II proteases may be obtained by modi- 
fications in positions close to the ion binding site. 

Such modifications may comprise the substitution of a positively charged amino
acid residue with a neutral or negatively charged residue, or the substitution of a
neutral residue with a negatively charged residue, or the deletion of a positively charged or
neutral residue in positions close to the ion-binding site.

Positions located at a distance of 10 Å or less from the ion-binding site of BLC are:
1, 2, 3, 4, 5, 6, 7, 8, 143, 144, 145, 146, 158, 159, 160, 161, 162, 194, 199, 200, and
201. Especially positions 2, 3, 4, 5, 6, 7, 144, 159, 160, 161 located at a distance of 6 Å
or less from the ion-binding site are important.
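
A position list of this kind can be derived from a PDB-format coordinate file. The following is a minimal sketch under stated assumptions: it reads the fixed-column ATOM/HETATM fields of the PDB format, takes the first atom found in the ion chain as the ion, and counts a residue as close if any of its atoms lies within the cutoff (the text does not state which atoms were used). The helper make_record and the toy coordinates are hypothetical and are not taken from Appendix 1.

```python
import math


def parse_atoms(pdb_lines):
    """Yield (atom_name, chain, residue_number, (x, y, z)) from ATOM/HETATM records."""
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            yield (
                line[12:16].strip(),      # atom name, columns 13-16
                line[21],                 # chain identifier, column 22
                int(line[22:26]),         # residue sequence number, columns 23-26
                (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            )


def residues_within(pdb_lines, ion_chain, protein_chain, cutoff):
    """Sorted residue numbers of protein_chain with any atom within cutoff of the ion."""
    atoms = list(parse_atoms(pdb_lines))
    ions = [xyz for _, chain, _, xyz in atoms if chain == ion_chain]
    if not ions:
        raise ValueError("no atom found in the ion chain")
    ion = ions[0]
    close = set()
    for _, chain, res_seq, xyz in atoms:
        if chain == protein_chain and math.dist(xyz, ion) <= cutoff:
            close.add(res_seq)
    return sorted(close)


def make_record(record, serial, name, res_name, chain, res_seq, x, y, z):
    """Build a toy fixed-column PDB record (hypothetical helper for this example)."""
    return "{:<6}{:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        record, serial, name, res_name, chain, res_seq, x, y, z, 1.0, 20.0
    )


# Toy coordinates only: chain A is the protease, chain C the ion, as in the text.
TOY = [
    make_record("ATOM", 1, "CA", "GLY", "A", 3, 3.0, 0.0, 0.0),
    make_record("ATOM", 2, "CA", "GLY", "A", 8, 0.0, 8.0, 0.0),
    make_record("ATOM", 3, "CA", "GLY", "A", 100, 20.0, 0.0, 0.0),
    make_record("HETATM", 4, "CA", "CA", "C", 1, 0.0, 0.0, 0.0),
]

if __name__ == "__main__":
    print(residues_within(TOY, "C", "A", 10.0))
    print(residues_within(TOY, "C", "A", 6.0))
```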

Corresponding positions in other RP-II proteases may be identified using Fig. 2
herein.

The modifications D7E and D7Q in BLC are examples of suitable modifications 
in one of these positions. 


Removal of ion-binding site in BLC 

By removing the ion-binding site it is possible to alter the enzyme's dependency
on calcium or other ions in the solution.

Removal of the calcium site in BLC can be done by the substitutions H144R
and/or D161R,K+H144Q,N (SEQ ID NO: 2). Similar modifications may be made in
structurally corresponding residues in other RP-II proteases.



Alteration of thermostability 

A variant with improved stability (typically increased thermostability) may be
obtained by modification of the mobility of identified regions, such as by introduction of
disulfide bond(s), substitution with proline, alteration of hydrogen bond contact(s),
altering charge distribution, introduction of salt bridge(s), filling in internal structural cavities
with one or more amino acids with bulkier side groups (in e.g. regions which are
structurally mobile), substitution of histidine residues with other amino acids, removal of
deamidation sites, or by helix capping.



Regions with increased mobility: 

The below indicated regions of BLC have an increased mobility in the crystal
structure of the enzyme, and it is presently believed that these regions can be
responsible for stability or activity of BLC and the other RP-II proteases. Especially
thermostabilisation may be obtained by altering the highly mobile regions. Generally,
thermostability may be improved by making these regions less mobile. Improvements of the
enzyme may be obtained by making modifications in the regions and positions identified
below. Introducing e.g. larger residues or residues having more atoms in the side chain
could increase the stability, or, e.g., introduction of residues having fewer atoms in the
side chain could be important for the mobility and thus the activity profile of the
enzyme. The regions can be found by analysing the B-factors taken from the coordinate
file in Appendix 1, and/or from molecular dynamics calculations of the isotropic
fluctuations. These can be obtained by using the program CHARMm from MSI (Molecular
Simulations Inc.).

Molecular dynamics simulation at 300K and 400K of BLC reveals the following 
highly mobile regions: 

26-31, 50-55, 89-91, and 193-198, and 4-5, 11-12, 26-31, 50-55, 69-70, 89-91, 178- 
183, 195-199 and 216-221, respectively. 

It is contemplated that modifications in these regions may influence the 
thermostability of RP-II proteases. Modifications are preferably made in the regions 26- 
31 (26, 27, 28, 29, 30, 31); 89-91 (89, 90, 91); 216-221 (216, 217, 218, 219, 220, 221), 
and especially in BLC the substitutions G30A and G91A. Similar modifications may be 
made in structurally corresponding residues in other RP-II proteases. 
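
As an illustration, the two sets of mobile regions reported above for 300K and 400K can be compared to see which positions are mobile at both temperatures. This is a minimal sketch; the helper names are hypothetical and the comparison itself is an illustrative addition rather than a step described in the text.

```python
def expand(ranges):
    """Expand inclusive (start, end) ranges into a set of positions."""
    return {pos for start, end in ranges for pos in range(start, end + 1)}


def to_ranges(positions):
    """Collapse a set of positions into sorted inclusive (start, end) runs."""
    runs = []
    for pos in sorted(positions):
        if runs and pos == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], pos)
        else:
            runs.append((pos, pos))
    return runs


# Highly mobile regions from the molecular dynamics simulations, as listed in the text.
MOBILE_300K = [(26, 31), (50, 55), (89, 91), (193, 198)]
MOBILE_400K = [(4, 5), (11, 12), (26, 31), (50, 55), (69, 70), (89, 91),
               (178, 183), (195, 199), (216, 221)]

MOBILE_AT_BOTH = to_ranges(expand(MOBILE_300K) & expand(MOBILE_400K))
MOBILE_ONLY_AT_400K = to_ranges(expand(MOBILE_400K) - expand(MOBILE_300K))

if __name__ == "__main__":
    print(MOBILE_AT_BOTH)
    print(MOBILE_ONLY_AT_400K)
```

Positions 30 and 91, the sites of the substitutions G30A and G91A, fall in regions that are mobile at both temperatures.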

Also B-factors (see X-Ray Structure Determination, Stout, G.K. and Jensen,
L.H., John Wiley & Sons, Inc. NY, 1989) from crystallographic data indicate the
following more mobile regions in the BLC (RP-II protease) structure:
51-56 (i.e. 51, 52, 53, 54, 55, 56)
88-94 (i.e. 88, 89, 90, 91, 92, 93, 94)
118-122 (i.e. 118, 119, 120, 121, 122)
173-183 (i.e. 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183)

It is contemplated that modifications in these regions may influence the thermo- 
stability of RP-II proteases. Modifications are preferably made in the regions 51-56 and 
118-122. 
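
By way of example, mobile stretches of this kind may be located by averaging the B-factors per residue and reporting consecutive residues above a cutoff. The sketch below assumes per-atom (residue number, B-factor) pairs have already been read from the coordinate file; the cutoff value, the minimum run length and the toy numbers are assumptions, since the text gives no numerical criterion.

```python
from collections import defaultdict


def mean_b_per_residue(atoms):
    """atoms: iterable of (residue_number, b_factor). Returns {residue_number: mean B}."""
    sums = defaultdict(lambda: [0.0, 0])
    for res, b in atoms:
        sums[res][0] += b
        sums[res][1] += 1
    return {res: total / count for res, (total, count) in sums.items()}


def mobile_runs(mean_b, threshold, min_length=2):
    """Return inclusive (start, end) runs of consecutive residues with mean B > threshold."""
    runs = []
    current = []
    for res in sorted(mean_b):
        above = mean_b[res] > threshold
        if above and (not current or res == current[-1] + 1):
            current.append(res)
            continue
        # Either the residue is below the cutoff or the numbering is not consecutive.
        if len(current) >= min_length:
            runs.append((current[0], current[-1]))
        current = [res] if above else []
    if len(current) >= min_length:
        runs.append((current[0], current[-1]))
    return runs


if __name__ == "__main__":
    # Toy values only, not B-factors from Appendix 1.
    toy_atoms = [(1, 10.0), (1, 12.0), (2, 30.0), (2, 34.0), (3, 40.0), (4, 11.0), (5, 35.0)]
    print(mobile_runs(mean_b_per_residue(toy_atoms), threshold=25.0))
```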

Disulfide bonds: 

An RP-II protease variant of the present invention with improved stability, e.g.
thermostability, as compared to the parent RP-II protease may be obtained by
introducing new inter-domain or intra-domain bonds to provide a more rigid and stable
structure, such as by establishing inter- or intra-domain disulfide bridges. This is done by
introducing cysteines in appropriate positions in the RP-II molecule by substitution(s) or
insertion(s).

According to the guidelines mentioned above the below mentioned amino acid
residues identified in the amino acid sequence of SEQ ID NO: 2 are contemplated as
being suitable for cysteine replacement. With one or more of these substitutions with
cysteine, disulfide bridges may form in a variant of BLC. A stabilising disulfide bridge
may be constructed through the substitutions S145C and T128C.

Surface charge distribution 

A variant with improved stability (typically improved thermostability or storage
stability) as compared to the parent RP-II protease may be obtained by changing the
surface charge distribution of the RP-II protease. For example, when the pH is lowered
to about 5 or below, histidine residues typically become positively charged and,
consequently, unfavorable electrostatic interactions on the protein surface may occur. By
engineering the surface charge of the RP-II protease one may avoid such unfavorable
electrostatic interactions, which in turn may lead to a higher stability of the RP-II protease.

Charged amino acid residues are (a) positively charged: Lys, Arg and His (pH<5),
and (b) negatively charged: Asp, Glu, Tyr (pH>9) and Cys (pH>10).

The surface charge distribution may be modified by (a) removing charged
residues from the surface through deletion of a charged residue or substituting an
uncharged residue for a charged residue, (b) adding charged residues to the surface
through insertion of a charged residue or substituting a charged residue for an
uncharged residue, or (c) reversing the charge at a residue through substituting a
positively charged residue for a negatively charged residue or substituting a negatively
charged residue for a positively charged residue.

Therefore, a further aspect of the present invention relates to a method for
constructing a variant of a parent RP-II protease having a modified surface charge
distribution, the method comprising:

a) identifying, on the surface of the parent RP-II protease, at least one charged
amino acid residue;

b) modifying the charged residue identified in step a) through deletion or
substitution with an uncharged amino acid residue;

c) optionally repeating steps a) and b) recursively;

d) preparing the variant resulting from steps a) - c);

e) testing the stability of said variant;

f) optionally repeating steps a) - e) recursively; and

g) selecting an RP-II protease variant having increased stability as compared to the
parent RP-II protease.



As will be understood by the skilled person it may also, in some cases, be
advantageous to substitute an uncharged amino acid residue with an amino acid
residue bearing a charge or, alternatively, it may in some cases be advantageous to
substitute an amino acid residue bearing a charge with an amino acid residue bearing a
charge of opposite sign. Thus, the above-mentioned method may be employed by the
skilled person also for these purposes. In the case of substituting an uncharged amino
acid residue with an amino acid residue bearing a charge the above-mentioned method
may be employed, the only difference being steps a) and b), which will then read:

a) identifying, on the surface of the parent RP-II protease, at least one position
being occupied by an uncharged amino acid residue;

b) modifying the charge in that position by substituting the uncharged amino acid
residue with a charged amino acid residue or by insertion of a charged amino
acid residue at the position.



Also in the case of changing the sign of the charge of an amino acid residue present on the
surface of the RP-II protease the above method may be employed. Again, compared to
the above method, the only difference is in steps a) and b) which, in this case, read:

a) identifying, on the surface of the parent RP-II protease, at least one charged
amino acid residue;

b) substituting the charged amino acid residue identified in step a) with an amino
acid residue having an opposite charge.

In order to determine the amino acid residues of a protease which are present
on the surface of the enzyme, the surface accessible area is measured using the
DSSP program (Kabsch and Sander, Biopolymers (1983), 22, 2577-2637). All residues
having a surface accessibility higher than 0, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.50,
0.55 or 0.60 are regarded as surface residues.

An amino acid residue found on the surface of BLC using the above method is
T109 and it is contemplated that the substitutions T109R, K, H are of particular interest.

Similar substitutions may be introduced in equivalent positions of other RP-II
proteases.
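
Step a) of the surface charge methods above, combined with the accessibility cutoffs just described, can be sketched as follows. The sketch assumes that a relative accessibility per position has been computed beforehand and is supplied as a dictionary; it treats only Lys and Arg as positive and Asp and Glu as negative, leaving out the pH-dependent residues. The function name, the default cutoff of 0.20 and the toy data are assumptions made for this example.

```python
POSITIVE = {"K", "R"}
NEGATIVE = {"D", "E"}


def charged_surface_residues(sequence, rel_accessibility, cutoff=0.20):
    """Return [(position, residue, sign)] for charged residues regarded as surface residues.

    sequence: one-letter amino acid sequence, numbered from 1.
    rel_accessibility: {position: relative accessibility}, assumed precomputed.
    A residue counts as surface-exposed if its accessibility is higher than cutoff.
    """
    hits = []
    for pos, aa in enumerate(sequence, start=1):
        if rel_accessibility.get(pos, 0.0) <= cutoff:
            continue
        if aa in POSITIVE:
            hits.append((pos, aa, "+"))
        elif aa in NEGATIVE:
            hits.append((pos, aa, "-"))
    return hits


if __name__ == "__main__":
    # Toy sequence and accessibilities, not derived from BLC.
    toy_seq = "AKDEG"
    toy_acc = {1: 0.50, 2: 0.40, 3: 0.10, 4: 0.25, 5: 0.90}
    print(charged_surface_residues(toy_seq, toy_acc))
```

The variants in which the sign is changed or a charge is added would use the same selection with a different residue filter.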

For the purpose of providing RP-II protease variants exhibiting improved wash
performance it is possible to modify the pI of the RP-II protease through modification of
the surface charge as indicated in WO 91/00345 (Novozymes A/S) and/or WO
99/20771 (Genencor International, Inc.).

Especially changing the pI of the RP-II protease is of interest.
Changes in BLC:
T109R, K, H
Q143R, K, H
E209Q, N
D7N, S, T
Q174R, K, H
N216R, K, H
Y17R, K, H
Y95R, K, H

Corresponding modifications may be performed in corresponding positions of
other RP-II proteases.



Substitution with proline residues 

Improved thermostability of an RP-II protease can be obtained by subjecting the
RP-II protease in question to analysis for secondary structure, identifying residues in the
RP-II protease having dihedral angles φ (phi) and ψ (psi) confined to the intervals
[-90°<φ<-40° and -180°<ψ<180°], preferably the intervals [-90°<φ<-40° and 120°<ψ<180°]
or [-90°<φ<-40° and -50°<ψ<10°], and excluding residues located in regions in which the
RP-II protease is characterized by possessing α-helical or β-sheet structure.

After the dihedral angles φ (phi) and ψ (psi) for the amino acids have been
calculated, based on the atomic structure in the crystalline RP-II proteases, it is possible to
select position(s) which has/have dihedral phi and psi angles favourable for substitution with
a proline residue. The aliphatic side chain of proline residues is bonded covalently to the
nitrogen atom of the peptide group. The resulting cyclic five-membered ring consequently
imposes a rigid constraint on the rotation about the N-Cα bond of the peptide backbone
and simultaneously prevents the formation of hydrogen bonding to the backbone N-atom.

For these structural reasons, proline residues are generally not compatible with
α-helical and β-sheet secondary conformations.

If a proline residue is not already at the identified position(s), the naturally occurring
amino acid residue is substituted with a proline residue, preferably by site directed
mutagenesis applied on a gene encoding the RP-II protease in question.
In the group of BLC-like proteases proline residues can be introduced at positions 18,
115, 185, 269 and 293. Accordingly, a preferred BLC variant has one or more of the
substitutions: T60P, S221P, G193P, and V194P.
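
The dihedral angle criteria for proline substitution can be illustrated by a simple filter. The following is a minimal sketch, assuming that phi and psi (in degrees) and a secondary structure label have already been obtained for each residue; the function name and the labels "helix", "sheet" and "loop" are hypothetical.

```python
def proline_candidate(phi, psi, secondary_structure):
    """Classify a residue as a proline substitution candidate.

    Returns "preferred" if (phi, psi) falls in one of the preferred intervals,
    "allowed" if it falls only in the broad interval, and None otherwise.
    Residues in helix or sheet regions are excluded, as stated in the text.
    """
    if secondary_structure in ("helix", "sheet"):
        return None
    if not (-90.0 < phi < -40.0):
        return None
    if 120.0 < psi < 180.0 or -50.0 < psi < 10.0:
        return "preferred"
    if -180.0 < psi < 180.0:
        return "allowed"
    return None


if __name__ == "__main__":
    # Toy angle pairs, not measured from the BLC structure.
    for phi, psi, ss in [(-60.0, 140.0, "loop"), (-60.0, 60.0, "loop"), (-60.0, 140.0, "helix")]:
        print(phi, psi, ss, proline_candidate(phi, psi, ss))
```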


Alteration of activity: 

Amino acid residues at a distance of less than 10 Å from the active site residues
are most likely to influence the specificity and activity of the RP-II proteases. Therefore,
variants comprising modifications in positions 1, 8, 22-35 (22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35), 42-58 (42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58), 82-100 (82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
99, 100), 129-135 (129, 130, 131, 132, 133, 134, 135), 141-142, 153-156 (153, 154,
155, 156), 158, 161-171 (161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171),
188-193 (188, 189, 190, 191, 192, 193), 195, 201-207 (201, 202, 203, 204, 205, 206, 207),
210, 213-214, 217 may provide a change in activity and/or specificity of the RP-II
protease variant.



Substrate binding site 

The substrate binding site is identified by the residues in contact with a substrate
model, such as the peptide DAFE. The 3D structure coordinates of the BLC protease with
DAFE bound in the active site can be found in Appendix 1. Without being limited to any
theory, it is presently believed that binding between a substrate and an enzyme is
supported by favorable interactions found within a sphere of 10 Å from the substrate
molecule, in particular within a sphere of 6 Å from the substrate molecule. Examples of such
favorable interactions are hydrogen bonds, strong electrostatic interactions and/or
hydrophobic interactions.

The following residues of the BLC protease (SEQ ID NO: 1) are within a distance
of 10 Å from the peptide DAFE and thus believed to be involved in interactions with said
substrate: 1, 2, 3, 8, 25, 29, 30, 31, 32, 33, 34, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
90, 91, 92, 93, 94, 95, 96, 97, 129, 131, 132, 133, 134, 135, 155, 157, 158, 159, 160,
161, 162, 163, 164, 165, 166, 167, 168, 169, 171, 189, 190, 191, 192, 193, 194, 195,
196, 197, 198, 200 and 204.

The following residues of the BLC protease (SEQ ID NO: 1) are within a
distance of 6 Å from the peptide DAFE and thus believed to be involved in interactions with
said substrate: 1, 2, 31, 32, 47, 48, 88, 91, 93, 96, 162, 163, 164, 165, 166, 167, 168,
190, 191, 192, 193, 194, 195, and 201.

Helix capping: 

For the RP-II proteases helix capping may be obtained by modifying the position
structurally corresponding to position 221 in BLC, and specifically in BLC by the
modification A221N,T.

Removal of deamidation sites 

For the RP-II proteases, removal of deamidation sites may be obtained by
modifying the positions structurally corresponding to positions 213, 216, and 222 of BLC,
and specifically in BLC by the modifications:

N213A,C,D,E,F,G,H,I,K,L,P,Q,R,S,T,V,Y,M,W, preferably N213L,T,S
N216A,C,D,E,F,G,H,I,K,L,P,Q,R,S,T,V,Y,M,W, preferably N216L,T,S
N222A,C,D,E,F,G,H,I,K,L,P,Q,R,S,T,V,Y,M,W, preferably N222L,T,S

Combined modifications 

The present invention also encompasses any of the above mentioned RP-II
protease variants in combination with any other modification to the amino acid sequence
thereof. Especially combinations with other modifications known in the art to provide
improved properties to the enzyme are envisaged. Such modifications to be combined
with any of the above indicated modifications are exemplified in the following.

Removal of critical oxidation sites 

In order to increase the stability of the RP-II protease it may be advantageous to
substitute or delete critical oxidation sites, such as methionines, with other amino acid
residues which are not subject to oxidation.

Accordingly, in a further embodiment the present invention relates to an RP-II
protease variant, in which one or more amino acid residues susceptible to oxidation,
especially methionine residues exposed to the surface of the molecule, is/are deleted or
replaced with another amino acid residue less susceptible to oxidation. The amino acid
residue less susceptible to oxidation may for instance be selected from the group
consisting of A, E, N, Q, I, L, S and K.

Specific such variants comprise at least one of the deletions or substitutions
M36{*,S,A,N,Q,K}; M160{*,S,A,N,Q,K} of the BLC protease; M144{*,S,A,N,Q,K} of the
AC116 and CDJ31 proteases; M67{*,S,A,N,Q,K}, M79{*,S,A,N,Q,K}, M137{*,S,A,N,Q,K},
M144{*,S,A,N,Q,K}, and M171{*,S,A,N,Q,K} of the B032, BIP and JA96 proteases;
M159{*,S,A,N,Q,K} of the B032 protease; M81{*,S,A,N,Q,K}, and M141{*,S,A,N,Q,K} in
the MPR protease; and M17{*,S,A,N,Q,K}, M67{*,S,A,N,Q,K}, M144{*,S,A,N,Q,K},
M160{*,S,A,N,Q,K}, M186{*,S,A,N,Q,K}, and M217{*,S,A,N,Q,K} of the AA513 protease
(positions are indicated in relation to the BLC protease as indicated in Fig. 2).
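
The brace shorthand used for these variants can be expanded into individual variant names mechanically. The sketch below assumes that an entry such as M36{*,S,A,N,Q,K} lists the alternatives at one position and that "*" denotes a deletion, consistent with the reference to deletions or substitutions above; the function name and the regular expression are hypothetical.

```python
import re

# Parent residue, position (optionally with a lowercase insertion letter), alternatives.
_SHORTHAND = re.compile(r"^([A-Z])(\d+[a-z]?)\{([^}]*)\}$")


def expand_variants(shorthand):
    """Expand e.g. 'M36{*,S,A,N,Q,K}' into ['M36*', 'M36S', ...]."""
    match = _SHORTHAND.match(shorthand.replace(" ", ""))
    if match is None:
        raise ValueError("unrecognised variant shorthand: %r" % shorthand)
    parent, position, options = match.groups()
    return [parent + position + option for option in options.split(",")]


if __name__ == "__main__":
    print(expand_variants("M36{*,S,A,N,Q,K}"))
```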


Modification of Asn-Gly sequences in the protease 

It is known that at alkaline pH, the side chain of Asn may interact with the NH
group of a sequential neighboring amino acid to form an isoAsp residue where the
backbone goes through the Asp side chain. This will leave the backbone more vulnerable
to proteolysis. The deamidation is much more likely to occur if the residue that follows is a
Gly. Changing the Asn in front of the Gly, or the Gly itself, will prevent this from happening and
thus improve the stability, especially as concerns thermo- and storage stability.

The invention consequently further relates to an RP-II protease variant, in which
either or both residues of any Asn-Gly sequence appearing in the amino acid
sequence of the parent RP-II protease is/are deleted or substituted with a residue of a
different amino acid.

The Asn and/or Gly residue may, for instance, be substituted with a residue of an 
amino acid selected from the group consisting of A, Q, S, P, T and Y. 

More specifically, any of the Asn or Gly residues of the Asn-Gly occupying
positions 68-69, 182-183 and/or 192-193 of the BLC protease; positions 68-69 and/or
192-193 of the AC116 and CDJ-31 proteases; positions 45-46, 74-75, 196-197, and/or
201-202 of the B032, JA96 and BIP proteases; positions 68-69, 103-104 and/or 192-196
of the MPR protease; and positions 90-91 and/or 201-202 of the AA513 protease, may be
deleted or substituted with a residue of an amino acid selected from the group consisting
of A, Q, S, P, T and Y (positions are indicated in relation to the BLC protease as indicated
in Fig. 2).
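
For illustration, Asn-Gly sequences can be located in any parent sequence with a simple scan. This minimal sketch numbers residues consecutively from 1 in the sequence supplied, which is an assumption: the positions given in the text are stated in relation to BLC via the alignment of Fig. 2, so a mapping step would still be needed for other proteases. The function name and toy sequence are hypothetical.

```python
def find_asn_gly(sequence):
    """Return [(asn_position, gly_position)] for every Asn-Gly pair, numbered from 1."""
    pairs = []
    for index in range(len(sequence) - 1):
        if sequence[index] == "N" and sequence[index + 1] == "G":
            pairs.append((index + 1, index + 2))
    return pairs


if __name__ == "__main__":
    # Toy sequence, not an RP-II protease sequence.
    print(find_asn_gly("ANGTSNGK"))
```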

Specific variants of BLC are: 

N68f,A,Q,S,P,T,Y}; G69r,A,Q,S,P,T,Y> 

N68r,A,Q,S,P,T,Y}+G69r,A,Q,S,P,T,Y} 

N182{*,A,Q,S,P,T,Y}; G183f ,A,Q,S,P,T,Y} 

10 N 1 82{*,A,Q,S ,P,T.Y}+G 1 83f ,A,Q,S, P.T.Y} 

N192{*,A,Q,S,P,T,Y}; G193r,A,Q,S,P,T,Y} 

N 1 92{*,A,Q,S,P,T,Y}+G 1 93{\A,Q,S , P,T,Y} 

and combinations thereof. 

Specific variants of the AC1 16 and CDJ-31 proteases are: 
15 N68{*,A ) Q,S,P,T,Y}; GegfAQ.S.P.T.Y} 

N68{*,A,Q,S I P,T ) Y}+G69r,A,Q,S,P,T,Y} 

N192f,A,Q,S,P,T,Y}; G193{*,A,Q,S,P,T,Y} 

N192{*,A,Q,S,P,T,Y}+G193{*,A,Q,S,P,T,Y} 

N68r,A,Q,S,P,T,Y}+N192{*,A,Q,S,P l T,Y} 
20 and combinations thereof. 

Specific variants of BQ32. JA96 and BIP proteases are: 

N45r,A,Q,S,P,T,Y}; G46f ,A,Q,S ( P,T,Y} 

N45r,A,Q,S,P 1 T,Y}+G46{*,A,Q,S,P,T,Y} 
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N74fAQ,S,P,T,Y}; G75CAQ,S,P,T,Y} 

N74{*,A,Q,S,P I T,Y}+G75{*AQ,S,P,T,Y} 

N196fAQ,S,P,T,Y}; G197{*AQ,S,P,T,Y} 

N196rAQ,S I P,T > Y}+G197rAQ,S 1 P,T,Y} 
5 N201{*AQ,S,P,T,Y}; G202{*AQ,S,P,T,Y} 

N20ir,A,Q,S,P 1 T,Y} + G202{*AQ ( S,P,T,Y} 

N45r,A 1 Q,S > P 1 T,Y}+N74r,A ( Q,S,P,T,Y} 

N45r f A ( Q,S,P,T,Y}+N196{*,A > Q,S ) P I T,Y} 

N45{*,A ( Q,S,P > T,Y}+N201{*,A,Q,S,P,T,Y} 
10 N74r,A,Q,S,P,T,Y}+N196r,A,Q I S,P,T,Y} 

N74{*AQ,S,P,T,Y}+N201f AQ.S.P.T.Y} 

N 1 96{* AQ.S ,P,T,Y}+N201{*,A,Q f S,P,T,Y} 

N45r,A ) Q,S,P l T,Y}+N74{*,A,Q > S,P,T,Y}+N196rAQ,S,P,T,Y} 

N45r,A ( Q ( S.P,T.Y}+N74{* > A,Q,S,P,T,Y}+N201{*,A,Q,S,P,T,Y} 
15 N45f > A I Q,S,P l T,Y}+N196r,A,Q,S,P,T I Y}+N20ir,A l Q,S,P,T,Y> 

N74{*,A l Q,S,P,T,Y}+N196r j A,Q,S f P,T,Y}+N201{*AQ ) S,P,T ) Y} 

N45r,A,Q l S,P,T,Y}+N74r,A > Q,S,P,T,Y}+N196{*,A,Q,S,P,T,Y}+N201f AQ,S,P,T,Y} 

and combinations thereof. 

Specific variants of AA51 3 are: 
20 N90{*AQ,S,P,T,Y}; GgifAQ.S.P.T.Y} 

N90^A,Q,S I P,T,Y}+G91{*AQ,S,P,T,Y} 
N20ir,A,Q ) S,P,T ) Y}; G202f AQ.S.P.T.Y} 
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N201{*AQ,S,P > T,Y}+G202{*AQ f SP,T,Y} 
N90fAQ l SP,T,Y}+N201{*AQ l S l P,T > Y} 
and combinations thereof. 
Specific variants of MPR are: 
5 N68{*AQ,S f P,T,Y}; G69{*AQ,S,P,T,Y} 

n68^aaspt,y}+g69{%aas,p,t,y} 

NIOSrAQ^^P^Y}; G104{*AQ,S,P,T,Y} 

NIOSr.A.Q.SPT^+GIO^AQ.SPJ.Y} 

N192f AQ,S,P,T,Y}; G196{*AQ,S,P t T,Y} 
10 N192fAQ,S l P,T f Y}+G196{*AQ,S J P,T l Y} 

Nesr.A^^P.T^+NIOSrAQ^.P.T.Y} 

N68rAQ,S,P,T,Y}+N192rAQ,S,P,T,Y} 

N103{*AQ,S,P,T,Y}+N192{*AQ I S,P,T I Y} 

N68r,A ) Q,S,P,T I Y}+N103rAQ,S t P ) T l Y}+N192r,A,Q T S,P,T,Y} 
15 and combinations thereof. 

Removal of autoproteolysis sites 

According to a further aspect of the invention autoproteolysis sites may be
removed by changing the amino acids at an autoproteolysis site. Since the RP-II
proteases cleave at Glu and Asp residues it is preferred to modify such residues of a
parent RP-II protease having the same or a similar specificity, preferably by substituting
with any other amino acid except Glu.

The parent RP-II proteases are mostly specific towards Glu and to a minor extent
towards Asp residues. Therefore the modification of the parent (trypsin-like) RP-II
protease may preferably be made by changing Glu to another amino acid residue
(including Asp). Experiments have indicated that the substitution of Ala for Glu or Asp
provides good results.

Glu and Asp residues are found in the BLC, CDJ31 and AC116 proteases in
positions E101, E152, E173, E209, D6, D51, D96, D135, D161, and D212. BLC has a
further Glu in position E104 and Asp in D7.

Specific BLC, CDJ31 and AC116 variants are thus E101A, E152A, E173A, E209A, 
D6A, D51A, D135A, D161A, D212A, and double, triple, quadruple, etc. combinations 
thereof. Further specific BLC variants are E104A and D7A. 

In JA96, B032 and BIP Glu and Asp are found at positions E81, E143, E151,
E209, D5, D6, D69, D96, D103, D135, D152, D161, and D173.

Specific JA96, B032 and BIP variants are thus E81A, E143A, E151A, E202A, 
D5A, D6A, D69A, D96A, D103A, D135A, D152A, D161A, D173A, and double, triple, 
quadruple, etc. combinations thereof. 

In MPR Glu and Asp are found at positions E7, E89a, E152, D6, D54, D92, D96,
D135, D144, D161, D177 and D209.

Specific MPR variants are thus E7A, E89aA, E152A, D6A, D54A, D92A, D96A,
D135A, D144A, D161A, D177A and D209A, and double, triple, quadruple, etc.
combinations thereof.

In AA513 Glu and Asp are found at positions E26, E55, E94, E117, E123, E137b,
E199, D40, D96, D103b, D103d, D135, D149, D154, D161, D184 and D209.

Specific AA513 variants are thus E26A, E55A, E94A, E117A, E123A, E137bA,
E199A, D40A, D96A, D103bA, D103dA, D135A, D149A, D154A, D161A, D184A and
D209A, and double, triple, quadruple, etc. combinations thereof.

Corresponding variants are easily identified in any other RP-II protease. 

Alternatively, autoproteolysis can be prevented by changing the amino acid residue
occupying the 1st and/or 2nd position following the Glu or Asp residue in question to Pro.
For instance, this may in BLC, CDJ31 and AC116 be done in the positions 174 and/or 175
as follows:

Q174P; S175P; Q174P+S175P

or in a similar manner in JA96, B032 or BIP at positions 152 and/or 153 as D152P;
T153P; or D152P+T153P.

Corresponding variants are easily identified in these and any other RP-II protease. 
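As an illustration of how corresponding variants might be identified in another sequence, the following Python sketch scans a sequence for Glu and Asp residues and proposes both kinds of modification described in this section. It is a simplified sketch under stated assumptions: the function names are hypothetical, residues are numbered consecutively from `first_position`, and the alignment-based numbering with insertion letters (e.g. 89a) used in this document is not handled.

```python
def ala_substitutions(sequence, first_position=1):
    """Propose an Ala substitution for every Glu (E) or Asp (D) in the sequence."""
    return [
        f"{residue}{number}A"
        for number, residue in enumerate(sequence, start=first_position)
        if residue in "ED"
    ]


def proline_blocks(sequence, first_position=1):
    """Propose Pro at the 1st and 2nd position following each Glu or Asp."""
    variants = []
    for index, residue in enumerate(sequence):
        if residue not in "ED":
            continue
        for offset in (1, 2):
            target = index + offset
            # Skip positions past the C-terminus and residues that are already Pro.
            if target >= len(sequence) or sequence[target] == "P":
                continue
            variant = f"{sequence[target]}{target + first_position}P"
            if variant not in variants:
                variants.append(variant)
    return variants


# Toy fragment, NOT a real protease sequence: chosen only so that the output
# mirrors the E173 / Q174P / S175P example given in the text.
toy_fragment = "AEQS"
```

Calling `proline_blocks(toy_fragment, first_position=172)` returns the two single substitutions Q174P and S175P, and `ala_substitutions(toy_fragment, first_position=172)` returns E173A.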

Modification of tryptophan residues 

In order to stabilize the protein it may be advantageous to replace or delete
tryptophan residues at the surface of the protein, e.g., as described in US 5,118,623. The
tryptophan residues may advantageously be replaced by F, T, Q or G. Thus, in a
further embodiment the invention relates to an RP-II variant comprising one or more of the
following substitutions:

BLC and AC116:

W35{F,T,Q,G}; W88{F,T,Q,G}; W142{F,T,Q,G}; W217{F,T,Q,G}

CDJ31:

W142{F,T,Q,G}; W217{F,T,Q,G}

B032, JA96 and BIP:

W142{F,T,Q,G}

AA513:

W30{F,T,Q,G}; W72{F,T,Q,G}; W142{F,T,Q,G}

MPR:

W57{F,T,Q,G}; W88{F,T,Q,G}; W112{F,T,Q,G}; W142{F,T,Q,G}; W217{F,T,Q,G}

Modification of tyrosines

In relation to wash performance it has been found that the modification of certain
tyrosine residues to phenylalanine provides an improved wash performance. Without being
bound by any specific theory, it is believed that titration of these Tyr residues in the
alkaline wash liquor has negative effects that are alleviated by replacing the Tyr residues
with other residues, especially Phe or Trp, particularly Phe.



In the BLC, AC116 and CDJ31 parent RP-II proteases, the following tyrosine
residues may be modified:

19, 50, 72, 74, 82, 95, 97, 112, 115, 117, 132, 154, 163, 195, 200. In BLC and CDJ31 the
tyrosines in positions 17 and 158 may also be modified, and in AC116 and CDJ31 the
tyrosine in position 172.

Examples of specific variants comprise one or more of the following substitu- 
tions: 

Y17{F,W}, Y19{F,W}, Y50{F,W}, Y72{F,W}, Y74{F,W}, Y82{F,W}, Y88{F,W}, Y95{F,W},
Y97{F,W}, Y112{F,W}, Y115{F,W}, Y117{F,W}, Y132{F,W}, Y154{F,W}, Y158{F,W},
Y163{F,W}, Y172{F,W}, Y195{F,W}, Y200{F,W}

In the JA96, B032 and BIP parent RP-II proteases, the following tyrosine
residues may be modified:

19, 24, 50, 57, 64, 83, 88, 95, 112, 132, 157, 158, 195, 216

Examples of specific JA96, B032 and BIP variants comprise one or more of the
following substitutions:

Y19{F,W}, Y24{F,W}, Y50{F,W}, Y57{F,W}, Y64{F,W}, Y83{F,W}, Y88{F,W}, Y95{F,W},
Y112{F,W}, Y132{F,W}, Y157{F,W}, Y158{F,W}, Y195{F,W} and Y216{F,W}

In the AA513 parent RP-II protease, the following tyrosine residues may be
modified:

24, 74, 77, 84, 88, 97, 130, 132, 158, 163, 193a 

Examples of specific AA513 variants comprise one or more of the following
substitutions:

Y24{F,W}, Y74{F,W}, Y77{F,W}, Y84{F,W}, Y88{F,W}, Y97{F,W}, Y130{F,W}, Y132{F,W},
Y158{F,W}, Y163{F,W}, Y193a{F,W}

In the MPR parent RP-II protease, the following tyrosine residues may be modi- 
fied: 

19, 28a, 30, 50, 72, 74, 77, 83, 95, 97, 113, 115, 154, 158, 163, 172, 175, 200, 216 

Examples of specific MPR variants comprise one or more of the following
substitutions:

Y19{F,W}, Y28a{F,W}, Y30{F,W}, Y50{F,W}, Y72{F,W}, Y74{F,W}, Y77{F,W}, Y83{F,W},
Y95{F,W}, Y97{F,W}, Y113{F,W}, Y115{F,W}, Y154{F,W}, Y158{F,W}, Y163{F,W},
Y172{F,W}, Y175{F,W}, Y200{F,W}, Y216{F,W}

Other modifications for combination

Examples of specific BLC variants comprise one or more of the following
substitutions:

E152{A,R,K,G}

E173A

E209A

E152G+G164R 



METHODS OF PREPARING RP-II PROTEASE VARIANTS

The RP-II protease variants of the present invention may be produced by any
known method within the art. The invention also relates to polynucleotides encoding the
RP-II protease variants of the present invention, DNA constructs comprising such
polynucleotides and host cells comprising such constructs or polynucleotides.

In general, naturally occurring proteins may be produced by culturing the organism
expressing the protein and subsequently purifying the protein, or recombinantly by
cloning a polynucleotide, e.g. genomic DNA or cDNA, encoding the protein into an
expression vector, introducing said expression vector into a host cell, culturing the host
cell and purifying the expressed protein.

Site-directed mutagenesis

Typically protein variants may be produced by site-directed mutagenesis of the
gene encoding a parent protein, introduction of the mutated gene into an expression
vector, host cell etc. The gene encoding the parent protein may be cloned from a strain
producing the polypeptide or from an expression library, i.e. it may be isolated from
genomic DNA or prepared from cDNA, or a combination thereof. The gene may even be a
fully synthetically produced gene.

In general standard procedures for cloning of genes and/or introducing mutations
(random and/or site directed) into said genes may be used in order to obtain a
parent RP-II protease, or RP-II protease variant of the invention. For further description
of suitable techniques reference is made to Molecular cloning: A laboratory manual
(Sambrook et al. (1989), Cold Spring Harbor lab., Cold Spring Harbor, NY; Ausubel, F.
M. et al. (eds.)); Current protocols in Molecular Biology (John Wiley and Sons, 1995;
Harwood, C. R., and Cutting, S. M. (eds.)); Molecular Biological Methods for Bacillus
(John Wiley and Sons, 1990); DNA Cloning: A Practical Approach, Volumes I and II
(D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid
Hybridization (B.D. Hames & S.J. Higgins eds. (1985)); Transcription And Translation
(B.D. Hames & S.J. Higgins, eds. (1984)); Animal Cell Culture (R.I. Freshney, ed.
(1986)); Immobilized Cells And Enzymes (IRL Press, (1986)); A Practical Guide To
Molecular Cloning (B. Perbal, (1984)) and WO 96/34946.



Localized and region specific random mutagenesis 

Random mutagenesis is suitably performed either as localized or region-specific
random mutagenesis in at least three parts of the gene translating to the amino acid
sequence in question, or within the whole gene.

The random mutagenesis of a DNA sequence encoding a parent RP-II protease 
may be conveniently performed by use of any method known in the art. 

In relation to the above, a further aspect of the present invention relates to a
method for generating a variant of a parent RP-II protease wherein the variant exhibits
an altered property, such as increased thermostability, increased stability at low pH and
at low calcium concentration, relative to the parent RP-II protease, the method
comprising:

a) subjecting a DNA sequence encoding the parent protease to localized or
region-specific random mutagenesis,
b) expressing the mutated DNA sequence obtained in step (a) in a host cell, and
c) screening for host cells expressing a RP-II protease variant which has an altered
property relative to the parent RP-II protease.

Step (a) of the above method of the invention is preferably performed using
doped primers.

When the mutagenesis is performed by the use of an oligonucleotide, the
oligonucleotide may be doped or spiked with the three non-parent nucleotides during the
synthesis of the oligonucleotide at the positions that are to be changed. The doping or
spiking may be done so that codons for unwanted amino acids are avoided. The doped
or spiked oligonucleotide can be incorporated into the DNA encoding the RP-II protease
by any published technique, using, e.g., PCR, LCR or any DNA polymerase and
ligase as deemed appropriate.

Preferably, the doping is carried out using "constant random doping", in which
the percentage of wild-type and modification in each position is predefined. Furthermore,
the doping may be directed toward a preference for the introduction of certain
nucleotides, and thereby a preference for the introduction of one or more specific
amino acid residues. The doping may be made, e.g., so as to allow for the introduction
of 90% wild type and 10% modifications in each position. An additional consideration in
the choice of a doping scheme is based on genetic as well as protein-structural
constraints. The doping scheme may be made by using the DOPE program which, inter
alia, ensures that introduction of stop codons is avoided (L.J. Jensen et al., Nucleic Acids
Research, 26, 697-702 (1998)).
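The effect of constant random doping on a pool of oligonucleotides can be illustrated with a small simulation. The following Python sketch is not the DOPE program and does not model oligonucleotide synthesis chemistry; it simply assumes that each position independently keeps the wild-type nucleotide with a fixed probability (0.9 in the 90%/10% example) and otherwise receives one of the three non-parent nucleotides with equal probability. All names are hypothetical.

```python
import random

NUCLEOTIDES = "ACGT"


def dope_oligo(wild_type, wild_type_fraction=0.9, rng=None):
    """Return one simulated oligonucleotide from a constant-random-doped pool."""
    rng = rng or random.Random()
    doped = []
    for base in wild_type.upper():
        if rng.random() < wild_type_fraction:
            doped.append(base)  # wild-type nucleotide retained
        else:
            # one of the three non-parent nucleotides, chosen uniformly
            doped.append(rng.choice([n for n in NUCLEOTIDES if n != base]))
    return "".join(doped)


def expected_mutations(length, wild_type_fraction=0.9):
    """Expected number of changed nucleotides in an oligonucleotide of given length."""
    return length * (1.0 - wild_type_fraction)
```

Under these assumptions a 30-nucleotide doped stretch carries about three nucleotide changes on average, which gives a feel for how the doping level controls the mutational load per variant.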

The DNA sequence to be mutagenized may conveniently be present in a
genomic or cDNA library prepared from an organism expressing the parent RP-II
protease. Alternatively, the DNA sequence may be present on a suitable vector such as a
plasmid or a bacteriophage, which as such may be incubated with or otherwise
exposed to the mutagenizing agent. The DNA to be mutagenized may also be present in
a host cell either by being integrated in the genome of said cell or by being present on
a vector harboured in the cell. Finally, the DNA to be mutagenized may be in isolated
form. It will be understood that the DNA sequence to be subjected to random
mutagenesis is preferably a cDNA or a genomic DNA sequence.

In some cases it may be convenient to amplify the mutated DNA sequence prior
to performing the expression step b) or the screening step c). Such amplification may
be performed in accordance with methods known in the art, the presently preferred
method being PCR-generated amplification using oligonucleotide primers prepared on
the basis of the DNA or amino acid sequence of the parent enzyme.

Subsequent to the incubation with or exposure to the mutagenizing agent, the
mutated DNA is expressed by culturing a suitable host cell carrying the DNA sequence
under conditions allowing expression to take place. The host cell used for this purpose
may be one which has been transformed with the mutated DNA sequence, optionally
present on a vector, or one which carried the DNA sequence encoding the parent
enzyme during the mutagenesis treatment. Examples of suitable host cells are the
following: gram positive bacteria such as Bacillus subtilis, Bacillus licheniformis, Bacillus
lentus, Bacillus brevis, Bacillus stearothermophilus, Bacillus alkalophilus, Bacillus
amyloliquefaciens, Bacillus coagulans, Bacillus circulans, Bacillus lautus, Bacillus
megaterium, Bacillus thuringiensis, Streptomyces lividans or Streptomyces murinus,
and gram negative bacteria such as E. coli.

The mutated DNA sequence may further comprise a DNA sequence encoding 
functions permitting expression of the mutated DNA sequence. 

Localised random mutagenesis 



The random mutagenesis may be advantageously localised to a part of the
parent RP-II protease in question. This may, e.g., be advantageous when certain regions
of the enzyme have been identified to be of particular importance for a given property
of the enzyme, and when modified are expected to result in a variant having improved
properties. Such regions may normally be identified when the tertiary structure of the
parent enzyme has been elucidated and related to the function of the enzyme.

The localised or region-specific random mutagenesis is conveniently performed
by use of PCR-generated mutagenesis techniques as described above or any other
suitable technique known in the art. Alternatively, the part of the DNA sequence to be
modified may be isolated, e.g., by insertion into a suitable
vector, and said part may be subsequently subjected to mutagenesis by use of any of
the mutagenesis methods discussed above.

General method for localised random mutagenesis by use of the DOPE program

The localised random mutagenesis may be carried out by the following steps:

1. Select regions of interest for modification in the parent enzyme

2. Decide on mutation sites and non-mutated sites in the selected region

3. Decide on which kind of mutations should be carried out, e.g. with respect
to the desired stability and/or performance of the variant to be
constructed

4. Select structurally based mutations

5. Adjust the residues selected in step 3 with regard to step 4.

6. Analyse by use of a suitable dope algorithm the nucleotide distribution.

7. If necessary, adjust the wanted residues to genetic code realism, e.g.
taking into account constraints resulting from the genetic code, e.g. in
order to avoid introduction of stop codons; the skilled person will be
aware that some codon combinations cannot be used in practice and
will need to be adapted

8. Make primers

9. Perform localised random mutagenesis by use of the primers

10. Select resulting RP-II protease variants by screening for the desired
improved properties.

Suitable dope algorithms for use in step 6 are well known in the art. One such
algorithm is described by Tomandl, D. et al., 1997, Journal of Computer-Aided Molecular
Design 11:29-38. Another algorithm is DOPE (Jensen, LJ, Andersen, KV, Svendsen,
A, and Kretzschmar, T (1998) Nucleic Acids Research 26:697-702).
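The genetic-code constraint mentioned in step 7 can be made concrete by enumerating which amino acids, and whether any stop codon, a given degenerate codon can encode. The sketch below is neither the DOPE program nor the algorithm of Tomandl et al.; it is a minimal illustration that assumes the standard genetic code and the IUPAC nucleotide ambiguity codes, with hypothetical function names.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def encoded_residues(degenerate_codon):
    """Return the sorted set of residues ('*' = stop) a degenerate codon can encode."""
    choices = [IUPAC[symbol] for symbol in degenerate_codon.upper()]
    return sorted({CODON_TABLE["".join(codon)] for codon in product(*choices)})


def allows_stop(degenerate_codon):
    """True if at least one codon covered by the degenerate codon is a stop codon."""
    return "*" in encoded_residues(degenerate_codon)
```

For instance, `encoded_residues("TWT")` covers only Tyr and Phe, matching the tyrosine-to-phenylalanine modifications discussed earlier, whereas a fully random `NNN` codon also admits stop codons and would have to be adapted.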

Expression vectors 

A recombinant expression vector comprising a nucleic acid sequence encoding
a RP-II protease variant of the invention may be any vector that may conveniently be
subjected to recombinant DNA procedures and which may bring about the expression
of the nucleic acid sequence.

The choice of vector will often depend on the host cell into which it is to be
introduced. Examples of a suitable vector include a linear or closed circular plasmid or a
virus. The vector may be an autonomously replicating vector, i.e., a vector which exists
as an extra-chromosomal entity, the replication of which is independent of chromosomal
replication, e.g., a plasmid, an extra-chromosomal element, a mini chromosome, or
an artificial chromosome. The vector may contain any means for assuring
self-replication. Examples of bacterial origins of replication are the origins of replication of
plasmids pBR322, pUC19, pACYC177, pACYC184, pUB110, pE194, pTA1060, and
pAMβ1. Examples of origins of replication for use in a yeast host cell are the 2 micron
origin of replication, the combination of CEN6 and ARS4, and the combination of CEN3
and ARS1. The origin of replication may be one having a mutation which makes it
function as temperature-sensitive in the host cell (see, e.g., Ehrlich, 1978, Proceedings of
the National Academy of Sciences USA 75:1433).

Alternatively, the vector may be one which, when introduced into the host cell, is
integrated into the genome and replicated together with the chromosome(s) into which
it has been integrated. Vectors which are integrated into the genome of the host cell
may contain any nucleic acid sequence enabling integration into the genome; in
particular they may contain nucleic acid sequences facilitating integration into the genome by
homologous or non-homologous recombination. The vector system may be a single
vector, e.g. plasmid or virus, or two or more vectors, e.g. plasmids or viruses, which
together contain the total DNA to be introduced into the genome of the host cell, or a
transposon.

The vector may in particular be an expression vector in which the DNA
sequence encoding the RP-II protease variant of the invention is operably linked to
additional segments or control sequences required for transcription of the DNA. The term
"operably linked" indicates that the segments are arranged so that they function in
concert for their intended purposes, e.g. transcription initiates in a promoter and proceeds
through the DNA sequence encoding the RP-II protease variant. Additional segments
or control sequences include a promoter, a polyadenylation sequence, a propeptide
sequence, a signal sequence and a transcription terminator. At a minimum the control
sequences include a promoter and transcriptional and translational stop signals.

The promoter may be any DNA sequence that shows transcriptional activity in
the host cell of choice and may be derived from genes encoding proteins either
homologous or heterologous to the host cell.

Examples of suitable promoters for use in bacterial host cells include the
promoter of the Bacillus subtilis levansucrase gene (sacB), the Bacillus stearothermophilus
maltogenic amylase gene (amyM), the Bacillus licheniformis alpha-amylase gene
(amyL), the Bacillus amyloliquefaciens alpha-amylase gene (amyQ), the Bacillus
subtilis alkaline protease gene, or the Bacillus pumilus xylosidase gene, the Bacillus
amyloliquefaciens BAN amylase gene, the Bacillus licheniformis penicillinase gene (penP),
the Bacillus subtilis xylA and xylB genes, and the prokaryotic beta-lactamase gene
(Villa-Kamaroff et al., 1978, Proceedings of the National Academy of Sciences USA
75:3727-3731). Other examples include the phage Lambda PR or PL promoters or the
E. coli lac, trp or tac promoters or the Streptomyces coelicolor agarase gene (dagA).
Further promoters are described in "Useful proteins from recombinant bacteria" in
Scientific American, 1980, 242:74-94; and in Sambrook et al., 1989, supra.

Examples of suitable promoters for use in a filamentous fungal host cell are
promoters obtained from the genes encoding Aspergillus oryzae TAKA amylase,
Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase,
Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori
glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease,
Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, Fusarium
oxysporum trypsin-like protease (as described in U.S. Patent No. 4,288,627, which is
incorporated herein by reference), and hybrids thereof. Particularly preferred promoters
for use in filamentous fungal host cells are the TAKA amylase, NA2-tpi (a hybrid of the
promoters from the genes encoding Aspergillus niger neutral alpha-amylase and Aspergillus
oryzae triose phosphate isomerase), and glaA promoters. Further suitable promoters
for use in filamentous fungus host cells are the ADH3 promoter (McKnight et al., The
EMBO J. 4 (1985), 2093 - 2099) or the tpiA promoter.

Examples of suitable promoters for use in yeast host cells include promoters
from yeast glycolytic genes (Hitzeman et al., J. Biol. Chem. 255 (1980), 12073 - 12080;
Alber and Kawasaki, J. Mol. Appl. Gen. 1 (1982), 419 - 434) or alcohol dehydrogenase
genes (Young et al., in Genetic Engineering of Microorganisms for Chemicals
(Hollaender et al., eds.), Plenum Press, New York, 1982), or the TPI1 (US 4,599,311) or
ADH2-4c (Russell et al., Nature 304 (1983), 652 - 654) promoters.

Further useful promoters are obtained from the Saccharomyces cerevisiae
enolase (ENO-1) gene, the Saccharomyces cerevisiae galactokinase gene (GAL1), the
Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate
dehydrogenase genes (ADH2/GAP), and the Saccharomyces cerevisiae
3-phosphoglycerate kinase gene. Other useful promoters for yeast host cells are
described by Romanos et al., 1992, Yeast 8:423-488. In a mammalian host cell, useful
promoters include viral promoters such as those from Simian Virus 40 (SV40), Rous
sarcoma virus (RSV), adenovirus, and bovine papilloma virus (BPV).

Examples of suitable promoters for use in mammalian cells are the SV40
promoter (Subramani et al., Mol. Cell Biol. 1 (1981), 854 - 864), the MT-1 (metallothionein
gene) promoter (Palmiter et al., Science 222 (1983), 809 - 814) or the adenovirus 2
major late promoter.

Examples of suitable promoters for use in insect cells are the polyhedrin
promoter (US 4,745,051; Vasuvedan et al., FEBS Lett. 311, (1992) 7 - 11), the P10
promoter (J.M. Vlak et al., J. Gen. Virology 69, 1988, pp. 765-776), the Autographa
californica polyhedrosis virus basic protein promoter (EP 397 485), the baculovirus immediate
early gene 1 promoter (US 5,155,037; US 5,162,222), or the baculovirus 39K
delayed-early gene promoter (US 5,155,037; US 5,162,222).

The DNA sequence encoding a RP-II protease variant of the invention may also,
if necessary, be operably connected to a suitable terminator.

The recombinant vector of the invention may further comprise a DNA sequence
enabling the vector to replicate in the host cell in question.

The vector may also comprise a selectable marker, e.g. a gene the product of
which complements a defect in the host cell, or a gene encoding resistance to e.g.
antibiotics like ampicillin, kanamycin, chloramphenicol, erythromycin, tetracycline,
spectinomycin, neomycin, hygromycin, methotrexate, or resistance to heavy metals,
virus or herbicides, or which provides for prototrophy to auxotrophs. Examples of
bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus
licheniformis, or resistance markers. A frequently used mammalian marker is the dihydrofolate
reductase gene (DHFR). Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2,
MET3, TRP1, and URA3. A selectable marker for use in a filamentous fungal host cell
may be selected from the group including, but not limited to, amdS (acetamidase), argB
(ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB
(hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate
decarboxylase), sC (sulfate adenyltransferase), trpC (anthranilate synthase), and
glufosinate resistance markers, as well as equivalents from other species. Particularly
preferred for use in an Aspergillus cell are the amdS and pyrG markers of Aspergillus nidulans or
Aspergillus oryzae and the bar marker of Streptomyces hygroscopicus. Furthermore,
selection may be accomplished by co-transformation, e.g., as described in WO
91/17243, where the selectable marker is on a separate vector.

To direct a RP-II protease variant of the present invention into the secretory
pathway of the host cells, a secretory signal sequence (also known as a leader
sequence, prepro sequence or pre sequence) may be provided in the recombinant vector.
The secretory signal sequence is joined to the DNA sequence encoding the enzyme in
the correct reading frame. Secretory signal sequences are commonly positioned 5' to
the DNA sequence encoding the enzyme. The secretory signal sequence may be that
normally associated with the enzyme or may be from a gene encoding another
secreted protein.
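The requirement that the signal sequence be joined "in the correct reading frame" can be checked computationally before a construct is made. The following Python sketch is a simplified illustration with hypothetical names and toy sequences: it only verifies that a signal sequence placed 5' to the coding sequence preserves the triplet frame and that no stop codon occurs before the final codon. It does not model signal peptidase cleavage sites or any other feature of a real construct.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(signal_dna, coding_dna):
    """Join a signal sequence 5' to a coding sequence and verify the reading frame."""
    signal, coding = signal_dna.upper(), coding_dna.upper()
    if len(signal) % 3 != 0:
        raise ValueError("signal sequence length is not a multiple of 3; frame would shift")
    if len(coding) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    fusion = signal + coding
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    # Every codon except the last must be a sense codon.
    for number, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            raise ValueError(f"premature stop codon {codon} at codon {number}")
    return fusion


# Toy sequences for illustration only, not taken from any real gene.
toy_signal = "ATGAAAAAG"
toy_coding = "GCTGAATAA"
```

Here `fuse_in_frame(toy_signal, toy_coding)` returns the 18-nucleotide fusion, while a signal sequence that is one nucleotide too short is rejected because it would shift the frame of the enzyme-encoding sequence.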

The procedures used to ligate the DNA sequences coding for the present
enzyme, the promoter and optionally the terminator and/or secretory signal sequence,
respectively, or to assemble these sequences by suitable PCR amplification schemes,
and to insert them into suitable vectors containing the information necessary for
replication or integration, are well known to persons skilled in the art (cf., for instance,
Sambrook et al.).

More than one copy of a nucleic acid sequence encoding an enzyme of the pre- 
sent invention may be inserted into the host cell to amplify expression of the nucleic 
acid sequence. Stable amplification of the nucleic acid sequence can be obtained by 
integrating at least one additional copy of the sequence into the host cell genome using 
methods well known in the art and selecting for transformants. 

The nucleic acid constructs of the present invention may also comprise one or 
more nucleic acid sequences which encode one or more factors that are advantageous 
in the expression of the polypeptide, e.g., an activator (e.g., a trans-acting factor), a 
chaperone, and a processing protease. Any factor that is functional in the host cell of 
choice may be used in the present invention. The nucleic acids encoding one or more 
of these factors are not necessarily in tandem with the nucleic acid sequence encoding 
the polypeptide. 

Host cells 

The DNA sequence encoding a RP-II protease variant of the present invention 
may be either homologous or heterologous to the host cell into which it is introduced. If 
homologous to the host cell, i.e. produced by the host cell in nature, it will typically be 
operably connected to another promoter sequence or, if applicable, another secretory 
signal sequence and/or terminator sequence than in its natural environment. The term 
"homologous" is intended to include a DNA sequence encoding an enzyme native to 
the host organism in question. The term "heterologous" is intended to include a DNA 
sequence not expressed by the host cell in nature. Thus, the DNA sequence may be 
from another organism, or it may be a synthetic sequence. 

The host cell into which the DNA construct or the recombinant vector of the
invention is introduced may be any cell that is capable of producing the present RP-II
protease variants, such as prokaryotes, e.g. bacteria, or eukaryotes, such as fungal
cells, e.g. yeasts or filamentous fungi, insect cells, plant cells or mammalian cells.

Examples of bacterial host cells which, on cultivation, are capable of producing
the RP-II protease variants of the invention are gram-positive bacteria such as strains
of Bacillus, e.g. strains of B. subtilis, B. licheniformis, B. lentus, B. brevis, B.
stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans,
B. lautus, B. megaterium or B. thuringiensis, or strains of Streptomyces, such as S.
lividans or S. murinus, or gram-negative bacteria such as Escherichia coli or
Pseudomonas sp.

The transformation of the bacteria may be effected by protoplast transformation, 
electroporation, conjugation, or by using competent cells in a manner known per se (cf. 
Sambrook et al., supra). 
When expressing the RP-II protease variant in bacteria such as E. coli, the
enzyme may be retained in the cytoplasm, typically as insoluble granules (known as
inclusion bodies), or it may be directed to the periplasmic space by a bacterial secretion
sequence. In the former case, the cells are lysed and the granules are recovered and
denatured after which the enzyme is refolded by diluting the denaturing agent. In the
latter case, the enzyme may be recovered from the periplasmic space by disrupting the
cells, e.g. by sonication or osmotic shock, to release the contents of the periplasmic
space and recovering the enzyme.

When expressing the RP-II protease variant in gram-positive bacteria such as
Bacillus or Streptomyces strains, the enzyme may be retained in the cytoplasm, or it
may be directed to the extracellular medium by a bacterial secretion sequence. In the
latter case, the enzyme may be recovered from the medium as described below.

Examples of host yeast cells include cells of a species of Candida,
Kluyveromyces, Saccharomyces, Schizosaccharomyces, Pichia, Hansenula, or Yarrowia. In a
particular embodiment, the yeast host cell is a Saccharomyces carlsbergensis,
Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii,
Saccharomyces kluyveri, Saccharomyces norbensis or Saccharomyces oviformis cell. Other
useful yeast host cells are a Kluyveromyces lactis, Kluyveromyces fragilis, Hansenula
polymorpha, Pichia pastoris, Yarrowia lipolytica, Schizosaccharomyces pombe, Ustilago
maydis, Candida maltosa, Pichia guilliermondii and Pichia methanolica cell (cf. Gleeson et
al., J. Gen. Microbiol. 132, 1986, pp. 3459-3465; US 4,882,279 and US 4,879,231).
Since the classification of yeast may change in the future, for the purposes of this
invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner,
F.A., Passmore, S.M., and Davenport, R.R., eds, Soc. App. Bacteriol. Symposium
Series No. 9, 1980). The biology of yeast and manipulation of yeast genetics are well
known in the art (see, e.g., Biochemistry and Genetics of Yeast, Bacil, M., Horecker,
B.J., and Stopani, A.O.M., editors, 2nd edition, 1987; The Yeasts, Rose, A.H., and
Harrison, J.S., editors, 2nd edition, 1987; and The Molecular Biology of the Yeast
Saccharomyces, Strathern et al., editors, 1981). Yeast may be transformed using the
procedures described by Becker and Guarente, In Abelson, J.N. and Simon, M.I., editors,
Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Volume 194,
pp 182-187, Academic Press, Inc., New York; Ito et al., 1983, Journal of Bacteriology
153:163; and Hinnen et al., 1978, Proceedings of the National Academy of Sciences
USA 75:1920.

Examples of filamentous fungal cells include filamentous forms of the
subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra), in
particular it may be a cell of a species of Acremonium, such as A. chrysogenum,
Aspergillus, such as A. awamori, A. foetidus, A. japonicus, A. niger, A. nidulans or A.
oryzae, Fusarium, such as F. bactridioides, F. cerealis, F. crookwellense, F. culmorum,
F. graminearum, F. graminum, F. heterosporum, F. negundi, F. reticulatum, F. roseum,
F. sambucinum, F. sarcochroum, F. sulphureum, F. trichothecioides or F. oxysporum,
Humicola, such as H. insolens or H. lanuginosa, Mucor, such as M. miehei,
Myceliophthora, such as M. thermophilum, Neurospora, such as N. crassa, Penicillium,
such as P. purpurogenum, Thielavia, such as T. terrestris, Tolypocladium, or
Trichoderma, such as T. harzianum, T. koningii, T. longibrachiatum, T. reesei or T. viride, or a
teleomorph or synonym thereof. The use of Aspergillus spp. for the expression of
proteins is described in, e.g., EP 272 277, EP 230 023.

Examples of insect cells include a Lepidoptera cell line, such as Spodoptera
frugiperda cells or Trichoplusia ni cells (cf. US 5,077,214). Culture conditions may
suitably be as described in WO 89/01029 or WO 89/01028. Transformation of insect
cells and production of heterologous polypeptides therein may be performed as
described in US 4,745,051; US 4,775,624; US 4,879,236; US 5,155,037; US 5,162,222;
EP 397,485.

Examples of mammalian cells include Chinese hamster ovary (CHO) cells, HeLa
cells, baby hamster kidney (BHK) cells, COS cells, or any number of other immortalized
cell lines available, e.g., from the American Type Culture Collection. Methods of
transfecting mammalian cells and expressing DNA sequences introduced in the cells are
described in e.g. Kaufman and Sharp, J. Mol. Biol. 159 (1982), 601 - 621; Southern and
Berg, J. Mol. Appl. Genet. 1 (1982), 327 - 341; Loyter et al., Proc. Natl. Acad. Sci. USA
79 (1982), 422 - 426; Wigler et al., Cell 14 (1978), 725; Corsaro and Pearson, Somatic
Cell Genetics 7 (1981), 603; Ausubel et al., Current Protocols in Molecular Biology,
John Wiley and Sons, Inc., N.Y., 1987; Hawley-Nelson et al., Focus 15 (1993), 73;
Ciccarone et al., Focus 15 (1993), 80; Graham and van der Eb, Virology 52 (1973), 456;
and Neumann et al., EMBO J. 1 (1982), 841 - 845. Mammalian cells may be
transfected by direct uptake using the calcium phosphate precipitation method of Graham
and Van der Eb (1978, Virology 52:546).

Methods for expression and isolation of proteins

To express an enzyme of the present invention the above mentioned host cells 
transformed or transfected with a vector comprising a nucleic acid sequence encoding 
an enzyme of the present invention are typically cultured in a suitable nutrient medium
under conditions permitting the production of the desired molecules, after which these 
are recovered from the cells, or the culture broth. 

The medium used to culture the host cells may be any conventional medium
suitable for growing the host cells, such as minimal or complex media containing
appropriate supplements. Suitable media are available from commercial suppliers or may
be prepared according to published recipes (e.g. in catalogues of the American Type
Culture Collection). The media may be prepared using procedures known in the art
(see, e.g., references for bacteria and yeast; Bennett, J.W. and LaSure, L., editors,
More Gene Manipulations in Fungi, Academic Press, CA, 1991).

If the enzymes of the present invention are secreted into the nutrient medium,
they may be recovered directly from the medium. If they are not secreted, they may be
recovered from cell lysates. The enzymes of the present invention may be recovered
from the culture medium by conventional procedures including separating the host cells
from the medium by centrifugation or filtration, precipitating the proteinaceous
components of the supernatant or filtrate by means of a salt, e.g. ammonium sulphate,
purification by a variety of chromatographic procedures, e.g. ion exchange
chromatography, gel filtration chromatography, affinity chromatography, or the like,
dependent on the enzyme in question.

The enzymes of the invention may be detected using methods known in the art
that are specific for these proteins. These detection methods include use of specific
antibodies, formation of a product, or disappearance of a substrate. For example, an
enzyme assay may be used to determine the activity of the molecule. Procedures for
determining various kinds of activity are known in the art.

The enzymes of the present invention may be purified by a variety of procedures
known in the art including, but not limited to, chromatography (e.g., ion exchange,
affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures
(e.g., preparative isoelectric focusing (IEF)), differential solubility (e.g., ammonium
sulfate precipitation), or extraction (see, e.g., Protein Purification, J-C Janson and
Lars Ryden, editors, VCH Publishers, New York, 1989).






When an expression vector comprising a DNA sequence encoding an enzyme of 
the present invention is transformed/transfected into a heterologous host cell it is pos- 
sible to enable heterologous recombinant production of the enzyme. An advantage of 
using a heterologous host cell is that it is possible to make a highly purified enzyme 
composition, characterized in being free from homologous impurities, which are often 
present when a protein or peptide is expressed in a homologous host cell. In this con- 
text homologous impurities mean any impurity (e.g. other polypeptides than the en- 
zyme of the invention) which originates from the homologous cell where the enzyme of 
the invention is originally obtained from. 

DETERGENT APPLICATIONS 

The enzyme of the invention may be added to and thus become a component of 
a detergent composition. 

The detergent composition of the invention may for example be formulated as a 
hand or machine laundry detergent composition including a laundry additive composi- 
tion suitable for pre-treatment of stained fabrics and a rinse added fabric softener com- 
position, or be formulated as a detergent composition for use in general household 
hard surface cleaning operations, or be formulated for hand or machine dishwashing 
operations. 

In a specific aspect, the invention provides a detergent additive comprising the 
enzyme of the invention. The detergent additive as well as the detergent composition 
may comprise one or more other enzymes such as a protease, a lipase, a cutinase, an 
amylase, a carbohydrase, a cellulase, a pectinase, a mannanase, an arabinase, a
galactanase, a xylanase, an oxidase, e.g., a laccase, and/or a peroxidase.

In general the properties of the chosen enzyme(s) should be compatible with the 
selected detergent, (i.e. pH-optimum, compatibility with other enzymatic and non- 
enzymatic ingredients, etc.), and the enzyme(s) should be present in effective amounts. 

Proteases:

Suitable proteases include those of animal, vegetable or microbial origin. Microbial
origin is preferred. Chemically modified or protein engineered mutants are included.
The protease may be a serine protease or a metallo protease, preferably an
alkaline microbial protease or a trypsin-like protease. Examples of alkaline proteases
are subtilisins, especially those derived from Bacillus, e.g., subtilisin Novo, subtilisin
Carlsberg, subtilisin 309, subtilisin 147 and subtilisin 168 (described in WO 89/06279).
Examples of trypsin-like proteases are trypsin (e.g. of porcine or bovine origin) and the
Fusarium protease described in WO 89/06270 and WO 94/25583.

Examples of useful proteases are the variants described in WO 92/19729, WO
98/20115, WO 98/20116, and WO 98/34946, especially the variants with substitutions
in one or more of the following positions: 27, 36, 57, 76, 87, 97, 101, 104, 120, 123,
167, 170, 194, 206, 218, 222, 224, 235 and 274.

Preferred commercially available protease enzymes include Alcalase™, Savinase™,
Primase™, Duralase™, Esperase™, and Kannase™ (Novozymes A/S),
Maxatase™, Maxacal™, Maxapem™, Properase™, Purafect™, Purafect OxP™, FN2™,
and FN3™ (Genencor International Inc.).

Lipases:

Suitable lipases include those of bacterial or fungal origin. Chemically modified
or protein engineered mutants are included. Examples of useful lipases include lipases
from Humicola (synonym Thermomyces), e.g. from H. lanuginosa (T. lanuginosus) as
described in EP 258 068 and EP 305 216 or from H. insolens as described in WO
96/13580, a Pseudomonas lipase, e.g. from P. alcaligenes or P. pseudoalcaligenes
(EP 218 272), P. cepacia (EP 331 376), P. stutzeri (GB 1,372,034), P. fluorescens,
Pseudomonas sp. strain SD 705 (WO 95/06720 and WO 96/27002), P. wisconsinensis
(WO 96/12012), a Bacillus lipase, e.g. from B. subtilis (Dartois et al. (1993), Biochimica
et Biophysica Acta, 1131, 253-360), B. stearothermophilus (JP 64/744992) or B.
pumilus (WO 91/16422).

Other examples are lipase variants such as those described in WO 92/05249,
WO 94/01541, EP 407 225, EP 260 105, WO 95/35381, WO 96/00292, WO 95/30744,
WO 94/25578, WO 95/14783, WO 95/22615, WO 97/04079 and WO 97/07202.

Preferred commercially available lipase enzymes include Lipolase™ and Lipo- 
lase Ultra™ (Novozymes A/S). 

Amylases:

Suitable amylases (α and/or β) include those of bacterial or fungal origin. Chemically
modified or protein engineered mutants are included. Amylases include, for
example, α-amylases obtained from Bacillus, e.g. a special strain of B. licheniformis,
described in more detail in GB 1,296,839.

Examples of useful amylases are the variants described in WO 94/02597, WO
94/18314, WO 96/23873, and WO 97/43424, especially the variants with substitutions 
in one or more of the following positions: 15, 23, 105, 106, 124, 128, 133, 154, 156, 
181, 188, 190, 197, 202, 208, 209, 243, 264, 304, 305, 391, 408, and 444. 

Commercially available amylases are Duramyl™, Termamyl™, Fungamyl™ and 
BAN™ (Novozymes A/S), Rapidase™ and Purastar™ (from Genencor International 
Inc.). 

Cellulases:

Suitable cellulases include those of bacterial or fungal origin. Chemically modified
or protein engineered mutants are included. Suitable cellulases include cellulases
from the genera Bacillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium,
e.g. the fungal cellulases produced from Humicola insolens, Myceliophthora
thermophila and Fusarium oxysporum disclosed in US 4,435,307, US 5,648,263, US
5,691,178, US 5,776,757 and WO 89/09259.

Especially suitable cellulases are the alkaline or neutral cellulases having colour 
care benefits. Examples of such cellulases are cellulases described in EP 0 495 257, 
EP 0 531 372, WO 96/11262, WO 96/29397, WO 98/08940. Other examples are cellulase
variants such as those described in WO 94/07998, EP 0 531 315, US 5,457,046,
US 5,686,593, US 5,763,254, WO 95/24471, WO 98/12307 and PCT/DK98/00299. 

Commercially available cellulases include Celluzyme™, and Carezyme™ (No- 
vozymes A/S), Clazinase™, and Puradax HA™ (Genencor International Inc.), and KAC- 
500(B)™ (Kao Corporation). 

Peroxidases/Oxidases: 

Suitable peroxidases/oxidases include those of plant, bacterial or fungal origin. 
Chemically modified or protein engineered mutants are included. Examples of useful 
peroxidases include peroxidases from Coprinus, e.g. from C. cinereus, and variants 
thereof as those described in WO 93/24618, WO 95/10602, and WO 98/15257. 

Commercially available peroxidases include Guardzyme™ (Novozymes A/S). 

The detergent enzyme(s) may be included in a detergent composition by adding 
separate additives containing one or more enzymes, or by adding a combined additive 
comprising all of these enzymes. A detergent additive of the invention, i.e. a separate 
additive or a combined additive, can be formulated e.g. as a granulate, a liquid, a 
slurry, etc. Preferred detergent additive formulations are granulates, in particular
non-dusting granulates, liquids, in particular stabilized liquids, or slurries.

Non-dusting granulates may be produced, e.g., as disclosed in US 4,106,991
and 4,661,452 and may optionally be coated by methods known in the art. Examples of
waxy coating materials are poly(ethylene oxide) products (polyethylene glycol, PEG)
with mean molar weights of 1000 to 20000; ethoxylated nonylphenols having from 16 to
50 ethylene oxide units; ethoxylated fatty alcohols in which the alcohol contains from 12
to 20 carbon atoms and in which there are 15 to 80 ethylene oxide units; fatty alcohols;
fatty acids; and mono- and di- and triglycerides of fatty acids. Examples of film-forming
coating materials suitable for application by fluid bed techniques are given in GB
1483591. Liquid enzyme preparations may, for instance, be stabilized by adding a
polyol such as propylene glycol, a sugar or sugar alcohol, lactic acid or boric acid
according to established methods. Protected enzymes may be prepared according to the
method disclosed in EP 238,216.

The detergent composition of the invention may be in any convenient form, e.g.,
a bar, a tablet, a powder, a granule, a paste or a liquid. A liquid detergent may be
aqueous, typically containing up to 70 % water and 0-30 % organic solvent, or non- 
aqueous. 

The detergent composition comprises one or more surfactants, which may be 
non-ionic including semi-polar and/or anionic and/or cationic and/or zwitterionic. The
surfactants are typically present at a level of from 0.1 % to 60% by weight. 

When included therein the detergent will usually contain from about 1% to about
40% of an anionic surfactant such as linear alkylbenzenesulfonate,
alpha-olefinsulfonate, alkyl sulfate (fatty alcohol sulfate), alcohol ethoxysulfate,
secondary alkanesulfonate, alpha-sulfo fatty acid methyl ester, alkyl- or
alkenylsuccinic acid or soap.

When included therein the detergent will usually contain from about 0.2% to
about 40% of a non-ionic surfactant such as alcohol ethoxylate, nonylphenol
ethoxylate, alkylpolyglycoside, alkyldimethylamineoxide, ethoxylated fatty acid
monoethanolamide, fatty acid monoethanolamide, polyhydroxy alkyl fatty acid amide,
or N-acyl N-alkyl derivatives of glucosamine ("glucamides").

The detergent may contain 0-65 % of a detergent builder or complexing agent
such as zeolite, diphosphate, triphosphate, phosphonate, carbonate, citrate,
nitrilotriacetic acid, ethylenediaminetetraacetic acid, diethylenetriaminepentaacetic
acid, alkyl- or alkenylsuccinic acid, soluble silicates or layered silicates (e.g.
SKS-6 from Hoechst).

The detergent may comprise one or more polymers. Examples are
carboxymethylcellulose, poly(vinylpyrrolidone), poly(ethylene glycol), poly(vinyl alcohol),
poly(vinylpyridine-N-oxide), poly(vinylimidazole), polycarboxylates such as
polyacrylates, maleic/acrylic acid copolymers and lauryl methacrylate/acrylic acid
copolymers.

The detergent may contain a bleaching system which may comprise a H2O2
source such as perborate or percarbonate which may be combined with a
peracid-forming bleach activator such as tetraacetylethylenediamine or
nonanoyloxybenzenesulfonate. Alternatively, the bleaching system may comprise
peroxyacids of e.g. the amide, imide, or sulfone type.

The enzyme(s) of the detergent composition of the invention may be stabilized
using conventional stabilizing agents, e.g., a polyol such as propylene glycol or
glycerol, a sugar or sugar alcohol, lactic acid, boric acid, or a boric acid derivative,
e.g., an aromatic borate ester, or a phenyl boronic acid derivative such as
4-formylphenyl boronic acid, and the composition may be formulated as described in
e.g. WO 92/19709 and WO 92/19708.

The detergent may also contain other conventional detergent ingredients such
as e.g. fabric conditioners including clays, foam boosters, suds suppressors,
anti-corrosion agents, soil-suspending agents, anti-soil redeposition agents, dyes,
bactericides, optical brighteners, hydrotropes, tarnish inhibitors, or perfumes.

It is at present contemplated that in the detergent compositions any enzyme, in
particular the enzyme of the invention, may be added in an amount corresponding to
0.01-100 mg of enzyme protein per litre of wash liquor, preferably 0.05-5 mg of enzyme 
protein per litre of wash liquor, in particular 0.1-1 mg of enzyme protein per litre of wash 
liquor. 

The enzyme of the invention may additionally be incorporated in the detergent
formulations disclosed in WO 97/07202, which is hereby incorporated by reference.

FOOD PROCESSING APPLICATIONS 

The RP-II protease variants of the present invention may also be used in the
processing of food, especially in the field of dairy products, such as milk, cream and
cheese, but also in the processing of meat and vegetables.

FEED PROCESSING APPLICATION 

The RP-II protease variants of the present invention may also be used in the 
processing of feed for cattle, poultry, and pigs and especially for pet food. 

TREATMENT OF HIDES

The RP-II protease variants of the invention may also be used for the treatment
of hides.

MATERIALS AND METHODS

Strains:

B. subtilis DN1885: Disclosed in WO 01/16285

Plasmids:

pNM1003: Disclosed in WO 01/16285
pSX222: Disclosed in WO 96/34946
pNM1008: See Example 2

Method for producing a protease variant

The present invention provides a method of producing an isolated enzyme
according to the invention, wherein a suitable host cell, which has been transformed with
a DNA sequence encoding the enzyme, is cultured under conditions permitting the
production of the enzyme, and the resulting enzyme is recovered from the culture.

When an expression vector comprising a DNA sequence encoding the enzyme
is transformed into a heterologous host cell it is possible to enable heterologous
recombinant production of the enzyme of the invention. Thereby it is possible to make a
highly purified RP-II protease composition, characterized in being free from
homologous impurities.

The medium used to culture the transformed host cells may be any conventional
medium suitable for growing the host cells in question. The expressed RP-II protease
may conveniently be secreted into the culture medium and may be recovered therefrom
by well-known procedures including separating the cells from the medium by
centrifugation or filtration, precipitating proteinaceous components of the medium by
means of a salt such as ammonium sulfate, followed by chromatographic procedures
such as ion exchange chromatography, affinity chromatography, or the like.

Proteolytic Activity

Enzyme activity can be measured using the PNA assay with
succinyl-alanine-alanine-proline-glutamic acid-paranitroaniline as a substrate. The
principle of the PNA assay is described in the Journal of American Oil Chemists
Society, Rothgeb, T.M., Goodlander, B.D., Garrison, P.H., and Smith, L.A., (1988).

Textiles 

Standard textile pieces are obtained from EMPA St. Gallen, Lerchfeldstrasse 5,
CH-9014 St. Gallen, Switzerland, in particular type EMPA 116 (cotton textile stained
with blood, milk and ink) and EMPA 117 (polyester/cotton textile stained with blood,
milk and ink).

EXAMPLE 1 

Modelling RP-II proteases from the 3D structure of BLC 

The overall homology of Bacillus licheniformis protease BLC to other RP-II
proteases is high. The similarity between the different RP-II proteases is provided in
Table 1. Using the sequence alignment of Fig. 2 a model of the JA96 protease can be
built using a suitable modelling tool like the Accelrys software Homology, or Modeller
(also from Accelrys), or other software like Nest. These programs provide results as a
first rough model, with some optimization in the Modeller and Nest programs.

The first rough model provides a close structural homology between the model
of JA96 protease and the 3D structure of BLC as there are no overlapping side
chains in the model structure. To optimize the structure the protein can in silico be
soaked in a box of water and subjected to energy minimization and further molecular
dynamics simulations using e.g. the CHARMm™ software from Accelrys. The in silico
soaking in water can conveniently be done by adding water in the Insight II program
(from Accelrys) with a box size of 75 × 75 × 75 Å³. The energy minimization can be done
using settings of 300 steps of steepest descent (SD) followed by 600 steps of conjugate
gradient (CJ). The molecular dynamics simulations can conveniently be done as a 1.2 ns
run using the Verlet algorithm at 300 K and standard parameters (see CHARMm manual).
Other RP-II protease 3D models may be built in an analogous way.

EXAMPLE 2

Construction of library of RP-II protease variants
Construction and expression of BLC

A B. subtilis - E. coli shuttle vector, pNM1003, suited to a gene coding for RP-II
protease BLC and its mutants was constructed. It is derived from the B. subtilis
expression vector pSX222 (described in WO 96/34946) as described in WO 01/16285. To
facilitate the cloning of fragments inside the vector, pNM1008 was constructed by
introducing a KpnI restriction site downstream of the HindIII site. For transformation
in Bacillus, pNM1008 was restricted with HindIII and a 4350 bp DNA fragment was
isolated and ligated. The ligation mixture was used to transform competent B. subtilis
DN1885, selecting for protease activity, as described in WO 01/16285.

Site-directed mutagenesis 

BLC site-directed variants of the invention comprising specific substitutions,
insertions or deletions in the molecule were made by traditional cloning of PCR
fragments (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold
Spring Harbor) produced by oligonucleotides containing the desired modification. As
template pNM1008 was used. In a first PCR, a mutational primer (anti-sense) was used
with a suitable opposite sense primer (e.g. 5'-CTGTGCCCTTTAACCGCACAGC (SEQ ID
No. 17)) downstream of the MluI site. The resulting DNA fragment was used
as a sense primer in a second PCR together with a suitable anti-sense primer (e.g.
5'-GCATAAGCTTTTACAGGTACCGGC (SEQ ID No. 18)) upstream from the KpnI digestion
site. The resulting PCR product was digested with KpnI and MluI and ligated in
pNM1008 digested with the respective enzymes.

The ligation reaction was transformed into E. coli by well-known techniques and
5 randomly chosen colonies were sequenced to confirm the designed mutations.

In order to express a BLC variant of the invention, the pNM1008 derived plasmid
comprising the variant was digested with HindIII, ligated and transformed into a
competent B. subtilis strain, selecting for protease activity.
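
As an illustration of the cloning logic above, the following minimal sketch checks which restriction sites occur in the two example primers. It is not part of the described procedure. The primer strings are the sequences of SEQ ID No. 17 and 18 as printed in this example with the spacing removed, the recognition sequences are the standard ones for HindIII, KpnI and MluI, and the function name `find_sites` is hypothetical.

```python
# Standard recognition sequences (all three are palindromic, so scanning one
# strand is sufficient).
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "MluI": "ACGCGT",
}

# Primer sequences as printed in Example 2 (spacing removed).
SEQ_ID_17 = "CTGTGCCCTTTAACCGCACAGC"
SEQ_ID_18 = "GCATAAGCTTTTACAGGTACCGGC"


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = sequence.upper().replace(" ", "")
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1)  # report 1-based positions
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, primer in (("SEQ ID No. 17", SEQ_ID_17), ("SEQ ID No. 18", SEQ_ID_18)):
        print(name, find_sites(primer))
```

Under these assumptions the anti-sense primer (SEQ ID No. 18) carries both a HindIII and a KpnI site, which is consistent with the KpnI digestion step described above, while the sense primer (SEQ ID No. 17) carries none of the three sites.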

EXAMPLE 3

Purification of Enzymes and Variants:

This procedure relates to purification of 2 liter scale fermentation for the
production of the RP-II proteases of the invention in a Bacillus host cell.

Approximately 1.6 liters of fermentation broth are centrifuged at 5000 rpm for 35
minutes in 1 liter beakers. The supernatants are adjusted to pH 7 using 10% acetic acid
and filtered through a Seitz Supra S100 filter plate.

At room temperature, the filtrate is applied to a 100 ml Bacitracin affinity column
equilibrated with 0.01 M dimethylglutaric acid, 0.1 M boric acid and 0.002 M calcium
chloride adjusted to pH 7 with sodium hydroxide (Buffer A). After washing the column
with Buffer A to remove unbound protein, the protease is eluted from the Bacitracin
column using Buffer A supplemented with 25% 2-propanol and 1 M sodium chloride.

The fractions with protease activity from the Bacitracin purification step are 
combined and applied to a 750 ml Sephadex G25 column (5 cm dia.) equilibrated with 
Buffer A. 

Fractions with proteolytic activity from the Sephadex G25 column are combined,
the pH is adjusted to 6 with 10% acetic acid, and the pool is applied to a 150 ml CM
Sepharose CL 6B cation exchange column (5 cm dia.) equilibrated with a buffer
containing 0.01 M dimethylglutaric acid, 0.1 M boric acid, and 0.002 M calcium chloride
adjusted to pH 6 with sodium hydroxide.

The protease is eluted using a linear gradient of 0-0.2 M sodium chloride in 2 
liters of the same buffer. 
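
For orientation, the salt concentration reached at a given point of such a linear gradient can be sketched as below. This is a minimal illustration only: the function name and its parameters are hypothetical, and it assumes an ideal linear gradient over the stated 2 liters with no mixing delay or column dead volume.

```python
def gradient_concentration(volume_l, start_m=0.0, end_m=0.2, total_volume_l=2.0):
    """Sodium chloride concentration (M) after `volume_l` liters of an ideal
    linear gradient running from `start_m` to `end_m` over `total_volume_l`."""
    if not 0.0 <= volume_l <= total_volume_l:
        raise ValueError("volume must lie within the gradient")
    fraction = volume_l / total_volume_l  # fraction of the gradient delivered
    return start_m + (end_m - start_m) * fraction


if __name__ == "__main__":
    # Concentration at a few points along the 0-0.2 M, 2 liter gradient.
    for v in (0.0, 0.5, 1.0, 2.0):
        print(f"{v:.1f} l -> {gradient_concentration(v):.3f} M NaCl")
```

Such a helper can be used to relate the elution volume of an active fraction to the approximate salt concentration at which the protease leaves the column.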

Finally, the protease containing fractions from the CM Sepharose column are 
combined and filtered through a 0.2 µm filter.

By using the techniques of Example 2 for the construction of variants and 
fermentation, and the above isolation procedure the following RP-II proteases and 
variants thereof may be produced and isolated: 



EXAMPLE 4 

Wash performance of detergent compositions comprising modified enzymes 

AMSA

The enzyme variants of the present application are tested using the Automatic 
Mechanical Stress Assay (AMSA). With the AMSA test the wash performance of a 
large quantity of small volume enzyme-detergent solutions can be examined. The 
AMSA plate has a number of slots for test solutions and a lid firmly squeezing the 
textile swatch to be washed against all the slot openings. During the washing time, the 
plate, test solutions, textile and lid are vigorously shaken to bring the test solution in 
contact with the textile and apply mechanical stress. For further description see WO 
02/42740 especially the paragraph "Special method embodiments" at page 23-24. 

The assay is conducted under the experimental conditions specified below: 



Detergent base                          Omo Acao
Detergent dosage                        1.5 g/l
Test solution volume                    160 µl
pH                                      10-10.5, adjusted with NaHCO3
Wash time                               12 minutes
Temperature                             20°C
Water hardness                          9°dH
Enzyme concentration in test solution   5 nM, 10 nM and 30 nM
Test material                           EMPA 117



After washing the textile pieces are flushed in tap water and air-dried.
The performance of the enzyme variant is measured as the brightness of the colour of
the textile samples washed with that specific enzyme variant. Brightness can also be
expressed as the intensity of the light reflected from the textile sample when illuminated
with white light. When the textile is stained the intensity of the reflected light is lower
than that of a clean textile. Therefore the intensity of the reflected light can be used to
measure the wash performance of an enzyme variant.

Colour measurements are made with a professional flatbed scanner (PFU 
DL2400pro), which is used to capture an image of the washed textile samples. The 
scans are made with a resolution of 200 dpi and with an output colour depth of 24 bits. In
order to get accurate results, the scanner is frequently calibrated with a Kodak
reflective IT8 target.

To extract a value for the light intensity from the scanned images, a specially
designed software application is used (Novozymes Color Vector Analyzer). The
program retrieves the 24 bit pixel values from the image and converts them into values 
for red, green and blue (RGB). The intensity value (Int) is calculated by adding the RGB 
values together as vectors and then taking the length of the resulting vector: 



$$\mathrm{Int} = \sqrt{r^2 + g^2 + b^2}$$

The wash performance (P) of the variants is calculated in accordance with the
formula below:

$$P = \mathrm{Int}(v) - \mathrm{Int}(r)$$

where

Int(v) is the light intensity value of the textile surface washed with the enzyme variant and
Int(r) is the light intensity value of the textile surface washed with the reference enzyme,
e.g. the parent RP-II protease, BLC or subtilisin 309 (BLSAVI).

The result of the AMSA wash of Hybrid IV is a Performance Score of S (n) in
accordance with the definition:

The Performance Score (S) summarizes the performances (P) of the tested enzyme
variants as:

S (2), which indicates that the variant performs better than the reference at all three
concentrations (5, 10 and 30 nM), and

S (1), which indicates that the variant performs better than the reference at one or two
concentrations.
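
The intensity, performance and score definitions above can be sketched in a few lines of code. This is only an illustrative sketch, not the Novozymes Color Vector Analyzer: the function names and the RGB values in the usage part are hypothetical, "performs better" is taken to mean P > 0, and the text defines no score for a variant that is better at none of the concentrations, so that case returns None here.

```python
import math


def intensity(r, g, b):
    """Length of the RGB vector: Int = sqrt(r^2 + g^2 + b^2)."""
    return math.sqrt(r * r + g * g + b * b)


def performance(variant_rgb, reference_rgb):
    """P = Int(v) - Int(r) for one enzyme concentration."""
    return intensity(*variant_rgb) - intensity(*reference_rgb)


def performance_score(p_values):
    """Score over the tested concentrations (5, 10 and 30 nM in the text).

    2    -> better than the reference (P > 0) at all concentrations
    1    -> better at some, but not all, concentrations
    None -> better at none (assumption: not defined in the text)
    """
    better = sum(1 for p in p_values if p > 0)
    if p_values and better == len(p_values):
        return 2
    if better >= 1:
        return 1
    return None


if __name__ == "__main__":
    # Hypothetical mean RGB values of washed swatches at 5, 10 and 30 nM.
    variant = [(180, 175, 170), (190, 186, 181), (201, 197, 193)]
    reference = [(176, 171, 166), (188, 184, 180), (200, 196, 191)]
    p = [performance(v, r) for v, r in zip(variant, reference)]
    print(p, performance_score(p))
```

Because the intensity is a vector length, a swatch that reflects more light in any of the three channels gets a higher Int value, so a positive P corresponds to a cleaner textile than the reference wash.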

Mini wash assay

The millilitre scale wash performance assay is conducted under the following 
conditions: 

Detergent base      Omo Acao detergent powder
Detergent dose      1.5 g/l
pH                  "as is" in the current detergent solution and is not adjusted.
Wash time           14 min.
Temperature         20°C
Water hardness      9°dH, adjusted by adding CaCl2·2H2O; MgCl2·6H2O; NaHCO3
                    (Ca2+:Mg2+:HCO3- = 2:1:6) to milli-Q water.
Enzymes             To be tested/reference
Enzyme conc.        5 nM, 10 nM
Test system         [volume illegible] ml glass beakers. Textile dipped in test solution.
                    Continuously up and down, 50 times per minute
Textile/volume      1 textile piece (13x3 cm) in 50 ml test solution
Test material       EMPA 117 textile swatches
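
The water hardness condition of the mini wash assay (9°dH at a molar ratio Ca2+:Mg2+:HCO3- of 2:1:6) can be translated into ion concentrations as sketched below. This is a hedged illustration, not part of the assay protocol: it assumes the usual definition of 1°dH as 10 mg CaO per litre, assumes that the hardness counts Ca2+ and Mg2+ together, and the function name is hypothetical.

```python
CAO_MOLAR_MASS = 56.077  # g/mol, molar mass of CaO
MG_CAO_PER_LITRE_PER_DH = 10.0  # 1 degree dH = 10 mg CaO per litre (assumed definition)


def hardness_ion_concentrations(dh, ratio=(2, 1, 6)):
    """Return mmol/l of Ca2+, Mg2+ and HCO3- for a hardness in degrees dH.

    `ratio` is the molar ratio Ca2+ : Mg2+ : HCO3-. The hardness is assumed
    to be the sum of the Ca2+ and Mg2+ concentrations.
    """
    ca_part, mg_part, hco3_part = ratio
    total_mmol = dh * MG_CAO_PER_LITRE_PER_DH / CAO_MOLAR_MASS  # Ca2+ + Mg2+
    unit = total_mmol / (ca_part + mg_part)  # mmol/l per ratio part
    return {
        "Ca2+": ca_part * unit,
        "Mg2+": mg_part * unit,
        "HCO3-": hco3_part * unit,
    }


if __name__ == "__main__":
    for ion, mmol in hardness_ion_concentrations(9.0).items():
        print(f"{ion}: {mmol:.2f} mmol/l")
```

Under these assumptions 9°dH corresponds to roughly 1.07 mmol/l Ca2+, 0.53 mmol/l Mg2+ and 3.21 mmol/l HCO3-; the amounts of the individual salts then follow from their molar masses.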



After wash the measurement of remission from the test material is done at 460
nm using a Zeiss MCS 521 VIS spectrophotometer. The measurements are made
according to the manufacturer's protocol.

As shown in Table 1 the textile washed with the RP-II variant at 20°C in Omo 
Acao has a ???? remission than the textile washed with the parent. This result indi- 
cates that this variant has ???? wash performance at low temperature than the parent 
BLC.



Table 1. Wash performance results of the RP-II protease variant in Omo Acao for a 
dosage of 5 nM and 10 nM enzyme. 



Enzyme               Remission, 5 nM enzyme    Remission, 10 nM enzyme
Blank (no enzyme)
BLC

ATOM   6678  CA  GLN B 222      -5.345  -6.085  47.610  1.00 13.48           C
ATOM   6680  CB BGLN B 222      -5.070  -7.420  46.900  0.35 14.51           C
ATOM   6681  CB AGLN B 222      -5.003  -7.403  46.863  0.65 14.16           C
ATOM   6686  CG BGLN B 222      -3.617  -7.830  46.798  0.35 16.11           C
ATOM   6687  CG AGLN B 222      -6.230  -8.072  46.189  0.65 12.89           C
ATOM   6692  CD BGLN B 222      -3.455  -9.200  46.202  0.35 17.67           C
ATOM   6693  CD AGLN B 222      -5.908  -9.289  45.310  0.65 14.84           C
ATOM   6694  OE1BGLN B 222      -4.040 -10.165  46.695  0.35 19.06           O
ATOM   6695  OE1AGLN B 222      -4.806  -9.840  45.371  0.65 18.23           O
ATOM   6696  NE2BGLN B 222      -2.655  -9.300  45.148  0.35 18.44           N
ATOM   6697  NE2AGLN B 222      -6.880  -9.712  44.495  0.65 13.26           N
ATOM   6702  C   GLN B 222      -4.109  -5.562  48.352  1.00 14.27           C
ATOM   6703  O   GLN B 222      -3.636  -6.231  49.284  1.00 17.27           O
ATOM   6704  OXT GLN B 222      -3.579  -4.486  48.029  1.00 15.01           O
ATOM   6705 CA    CA B 301      -0.643  21.256  17.293  1.00 10.41          CA
ATOM  13398  N   ASP F 401     -10.088   3.418  14.402  1.00 20.15           N
ATOM  13400  CA  ASP F 401     -10.419   4.298  15.551  1.00 19.20           C
ATOM  13402  CB  ASP F 401     -11.005   3.471  16.700  1.00 20.61           C
ATOM  13405  CG  ASP F 401     -12.475   3.140  16.497  1.00 22.97           C
ATOM  13406  OD1 ASP F 401     -13.045   2.395  17.327  1.00 26.18           O
ATOM  13407  OD2 ASP F 401     -13.144   3.572  15.537  1.00 25.29           O
ATOM  13408  C   ASP F 401      -9.196   5.076  16.021  1.00 16.65           C
ATOM  13409  O   ASP F 401      -9.239   5.713  17.069  1.00 16.48           O
ATOM  13412  N   ALA F 402      -8.115   5.032  15.242  1.00 14.63           N
ATOM  13414  CA  ALA F 402      -6.897   5.780  15.549  1.00 12.75           C
ATOM  13416  CB  ALA F 402      -7.112   7.245  15.277  1.00 12.61           C
ATOM  13420  C   ALA F 402      -6.485   5.557  16.999  1.00 11.06
ATOM  13421  O   ALA F 402      -6.190   6.500  17.738  1.00 10.74
ATOM  13422  N   PHE F 403      -6.464   4.296  17.429  1.00 10.84
ATOM  13424  CA  PHE F 403      -6.076   4.009  18.798  1.00 10.34
ATOM  13426  CB  PHE F 403      -6.233   2.517  19.116  1.00 11.44
ATOM  13429  CG  PHE F 403      -7.671   2.025  19.183  1.00 12.38
ATOM  13430  CD1 PHE F 403      -8.562   2.511  20.119  1.00 14.77
ATOM  13432  CE1 PHE F 403      -9.880   2.048  20.187  1.00 17.09
ATOM  13434  CZ  PHE F 403     -10.309   1.064  19.322  1.00 18.48
ATOM  13436  CE2 PHE F 403      -9.424   0.544  18.386  1.00 18.39
ATOM  13438  CD2 PHE F 403      -8.109   1.018  18.324  1.00 16.19
ATOM  13440  C   PHE F 403      -4.626   4.428  19.018  1.00 10.00
ATOM  13441  O   PHE F 403      -3.748   4.110  18.209  1.00 12.25
ATOM  13442  N   GLU F 404      -4.372   5.130  20.116  1.00  8.64
ATOM  13444  CA  GLU F 404      -3.025   5.588  20.427  1.00  8.12
ATOM  13446  CB  GLU F 404      -2.992   7.120  20.524  1.00  7.95
ATOM  13449  CG  GLU F 404      -3.122   7.705  19.117  1.00  8.08
ATOM  13452  CD  GLU F 404      -3.043   9.212  19.009  1.00  7.71
ATOM  13453  OE1 GLU F 404      -3.129   9.917  20.027  1.00  8.61
ATOM  13454  OE2 GLU F 404      -2.901   9.672  17.856  1.00  8.80
ATOM  13455  C   GLU F 404      -2.442   4.854  21.637  1.00  8.22
ATOM  13456  O   GLU F 404      -2.865   3.708  21.892  1.00  8.82
ATOM 


13457 


OXT 


GLU 


F 


4 04 


-1. 


513 


5 


.394 


22. 


258 


1 


. 00 


8 


.53 



c 

O 
N 

c 
c 
c 
c 
c 
c 
c 
c 
c 
o 

N 

c 
c 
c 
c 
o 
o 
c 

O 
O 

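Coordinate records of the kind listed above are straightforward to read programmatically. The following is a minimal sketch, assuming one whitespace-separated record per row with the fields in the order record type, atom serial, atom name, residue name, chain, residue number, x, y, z, occupancy, B-factor and element. The names `Atom` and `parse_atom_row` are illustrative only and are not part of the application; files in the fixed-column PDB format would need column-based slicing instead of `split()`.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float
    element: str


def parse_atom_row(row: str) -> Atom:
    """Parse one whitespace-separated ATOM row (assumed 12-field layout)."""
    fields = row.split()
    if len(fields) != 12 or fields[0] != "ATOM":
        raise ValueError(f"not a 12-field ATOM row: {row!r}")
    return Atom(
        serial=int(fields[1]),
        name=fields[2],
        res_name=fields[3],
        chain=fields[4],
        res_seq=int(fields[5]),
        x=float(fields[6]),
        y=float(fields[7]),
        z=float(fields[8]),
        occupancy=float(fields[9]),
        b_factor=float(fields[10]),
        element=fields[11].upper(),  # normalise element symbols such as "c" or "o"
    )


# Example row taken from the coordinate listing above.
atom = parse_atom_row("ATOM 13457 OXT GLU F 404 -1.513 5.394 22.258 1.00 8.53 O")
print(atom.res_name, atom.res_seq, atom.x, atom.y, atom.z)
```

Returning a named tuple keeps each field addressable by name, which makes later steps such as distance calculations easier to read.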





10517.000-DK 



PATENT CLAIMS 

1. A method for constructing a RP-II protease variant, wherein the variant has at
least one altered property as compared to a parent RP-II protease, which method
comprises:

a) analyzing the three-dimensional structure of the RP-II protease to identify, on the
basis of an evaluation of structural considerations, at least one amino acid
residue or at least one structural region of the RP-II protease, which is of relevance
for altering said property;
b) modifying the DNA of the polynucleotide encoding the parent to construct a
polynucleotide encoding a variant RP-II protease, which in comparison to the
parent RP-II protease, has been modified by deletion, substitution or insertion of
the amino acid residue or structural part identified in a) so as to alter said
property;
c) expressing the variant RP-II protease in a suitable host, and
d) testing the resulting RP-II protease variant for said property.

2. A method of producing a BLC like RP-II protease variant, wherein the variant
has at least one altered property as compared to a parent BLC like RP-II protease,
which method comprises:

a) producing a model structure of the parent BLC like RP-II protease on the
three-dimensional structure of BLC,

b) comparing the model three-dimensional structure of the parent BLC like RP-II
protease to the BLC structure by superimposing the structures through matching
the CA, CB, C, O, and N atoms of the active site residues,

c) identifying on the basis of the comparison in step b) at least one structural part
of the parent BLC like RP-II protease, wherein an alteration in said structural part
is predicted to result in an altered property;

d) modifying the nucleic acid sequence encoding the parent BLC like RP-II
protease to produce a nucleic acid sequence encoding at least one deletion or
substitution of one or more amino acids at a position corresponding to said structural
part, or at least one insertion of one or more amino acid residues in positions
corresponding to said structural part;

e) performing steps c) and d) iteratively N times, where N is an integer with the
value of one or more;

f) preparing the variant resulting from steps a) - e);

g) testing the stability of said variant; and

h) optionally repeating steps a) - g) recursively; and

i) selecting a RP-II protease variant having at least one altered property as
compared to the parent RP-II protease,

j) expressing the modified nucleic acid sequence in a host cell to produce the
variant RP-II protease;

k) isolating the produced protease;

l) purifying the isolated protease; and

m) recovering the purified RP-II protease variant.

3. The method of claim 2, wherein step (c) identifies amino acid residue positions
located at a distance of 10 Å or less to the ion-binding site of the RP-II protease parent,
preferably positions located at a distance of 6 Å or less.

4. The method of claim 2, wherein step (c) identifies amino acid residue positions in 
the RP-II protease parent, the modification of which provides for the removal of the ion 
binding site by modification of at least one of the positions identified. 


5. The method of claim 2, wherein step (c) identifies amino acid residue positions in 
highly mobile regions of the RP-II protease parent. 

7. The method of claim 2, wherein step (c) identifies amino acid residue positions in
mobile regions of the RP-II protease parent.

8. The method of claim 2, wherein step (c) identifies amino acid residue positions in
the parent RP-II protease, the modification of which may create at least one disulfide
bridge by insertion of or substitution with at least one Cys residue.

9. The method of claim 2, wherein steps (c) and (d) provide for constructing a
variant of a parent RP-II protease having a modified surface charge distribution by:

c') identifying, on the surface of the parent RP-II protease, at least one charged
amino acid residue;

d') modifying the charged residue identified in step (c') through deletion or
substitution with an uncharged amino acid residue.

10. The method of claim 2, wherein steps (c) and (d) provide for constructing a vari- 
ant of a parent RP-II protease having a modified surface charge distribution by: 

c") identifying, on the surface of the parent RP-II protease, at least one position be- 
ing occupied by an uncharged amino acid residue; 

d") modifying the charge in that position by substituting the uncharged amino acid 
residue with a charged amino acid residue or by insertion of a charged amino 
acid residue at the position. 

11. The method of claim 2, wherein steps (c) and (d) provide for constructing a
variant of a parent RP-II protease having a modified surface charge distribution by:

c''') identifying, on the surface of the parent RP-II protease, at least one charged
amino acid residue;

d''') substituting the charged amino acid residue identified in step (c''') with an amino
acid residue having an opposite charge.

12. The method of claim 2, wherein step (c) identifies amino acid residue positions in 
the parent RP-II protease, the modification of which to Pro may create a RP-II protease 
variant exhibiting improved stability. 

13. The method of claim 2, wherein step (c) identifies amino acid residue positions in
the parent RP-II protease at a distance of less than 10 Å from the active site residues.

14. The method of one or more of claims 2 to 13, wherein N in step (e) is an integer 
between 1 and 50, 45, 40, 35, 30, 25, 20, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2. 

15. A RP-II protease variant comprising at least one modification in an amino acid
residue in a position located at a distance of 10 Å or less to the ion-binding site,
preferably positions located at a distance of 6 Å or less.

16. The variant of claim 15, wherein modifications are made in at least one of the 
positions: 1, 2, 3, 4, 5, 6, 7, 8, 143, 144, 145, 146, 158, 159, 160, 161, 162, 194, 199, 
200, and 201, preferably positions 2, 3, 4, 5, 6, 7, 144, 159, 160, and 161, and espe- 
cially the modifications D7E and D7Q in BLC (SEQ ID NO: 2), where the positions refer 
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to BLC or corresponding positions. 

17. The variant of claim 15 or 16, wherein the modification comprises the
substitution of a positively charged amino acid residue with a neutral or negatively charged
residue, or the substitution of a neutral residue with a negatively charged residue or the
deletion of a positively charged or neutral residue.

18. The variant of claim 15, wherein the ion binding site is removed by modification
in at least one of the positions corresponding to positions 144 and/or 161 of BLC,
especially the modifications H144R and/or D161R,K+H144Q,N in BLC (SEQ ID NO: 2).

19. A RP-II protease variant comprising at least one modification in an amino acid 
residue in highly mobile regions in at least one of the positions corresponding to posi- 
tions 26-31 (26, 27, 28, 29, 30, and 31); 89-91 (89, 90, and 91); 216-221 (216, 217, 
218, 219, 220, and 221) of BLC. 

20. The variant of claim 19, wherein the parent is BLC and the modification
comprises G30A and/or G91A.

21. A RP-II protease variant comprising at least one modification made in mobile
regions in at least one of the positions corresponding to positions 51-56 (51, 52, 53, 54,
55, 56), 88-94 (88, 89, 90, 91, 92, 93, 94), 118-122 (118, 119, 120, 121, 122), and
173-183 (173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183) of BLC, preferably the
regions 51-56 and 118-122.

22. A RP-II protease variant having at least one disulfide bridge provided by modify- 
ing the amino acid residues in positions 128 and 145 in BLC or corresponding positions 
to Cys, preferably the substitutions S145C and T128C in BLC or corresponding posi- 
tions. 

23. A RP-II protease variant having a modified surface charge distribution in
comparison to the parent RP-II protease comprising modifications in at least one of the
positions corresponding to positions 7, 17, 95, 109, 143, 174, 209, 216 of BLC, especially
the modifications

D7N, S, T
Y17R, K, H
Y95R, K, H
T109R, K, H
Q143R, K, H
Q174R, K, H
E209Q, N
N216R, K, H

in BLC (SEQ ID NO. 1).

24. A RP-II protease variant exhibiting improved stability in comparison to the parent
RP-II protease comprising a substitution to Pro in at least one of the positions
corresponding to positions 18, 115, 185, 269 and 293 in BLC, especially one or more of the
substitutions: T60P, S221P, G193P, V194P in BLC (SEQ ID NO. 1).

25. A RP-II protease variant comprising modifications in amino acid residues in
positions corresponding to positions 1, 8, 22-35 (22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35), 42-58 (42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58),
82-100 (82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100), 129-135
(129, 130, 131, 132, 133, 134, 135), 141-142, 153-156 (153, 154, 155, 156), 158,
161-171 (161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171), 188-193 (188, 189, 190,
191, 192, 193), 195, 201-207 (201, 202, 203, 204, 205, 206, 207), 210, 213-214, 217
in BLC at a distance of less than 10 Å from the active site residues.

26. The RP-II protease variant of any of the claims 15 to 25, further comprising at
least one of the modifications (i) amino acid residues in positions that form part of an
Asn-Gly sequence being modified by deletion or substitution, preferably with Asp, Gln,
Ser, Pro, Thr, or Tyr; (ii) amino acid residues in positions that are occupied by a Trp being
modified by substitution with Phe, Thr, Gln or Gly; (iii) amino acid residues in positions
that are occupied by Glu or Asp being modified by substitution with Ala; (iv) amino acid
residues in positions that are the 1st or 2nd position following a
position occupied by a Glu or Asp residue being modified by substitution with a Pro; or (v)
amino acid residues in positions that are occupied by a Met being modified by deletion
or substitution, preferably with Ser or Ala.

27. The RP-II protease of any of claims 15 to 26 that is modified in a number of
positions ranging from at least one and up to 50 positions, or from 1 to 45 positions, or from
1 to 40 positions, or from 1 to 35 positions, or from 1 to 30 positions, or from 1 to 25
positions, or from 1 to 20 positions, or from 1 to 15 positions, or from 1 to 14 positions,
or from 1 to 13 positions, or from 1 to 12 positions, or from 1 to 11 positions, or from 1
to 10 positions, or from 1 to 9 positions, or from 1 to 8 positions, or from 1 to 7
positions, or from 1 to 6 positions, or from 1 to 5 positions, or from 1 to 4 positions, or from
1 to 3 positions, or from 1 to 2 positions, such modifications comprising substitutions,
deletions, insertions and combinations thereof in the indicated number of positions.

28. An isolated polynucleotide comprising a nucleic acid sequence, which encodes
for a RP-II protease variant defined or produced in any of the preceding claims.

29. The polynucleotide of claim 28, wherein the nucleic acid sequence has at least
50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%,
at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least 99% homology with the
nucleic acid sequence shown in SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, or SEQ ID NO: 15.

30. An isolated nucleic acid construct comprising a nucleic acid sequence as de- 
fined in any of claims 28-29, operably linked to one or more control sequences capable 
of directing the expression of the polypeptide in a suitable expression host. 

31. A recombinant host cell comprising the nucleic acid construct of claim 30.

32. A method for producing the RP-II variant defined or produced in any of claims 1
to 27, the method comprising:

a) cultivating the recombinant host cell of claim 31 under conditions conducive to 
the production of the RP-II protease variant; and 

b) recovering the variant. 

33. A detergent composition comprising a RP-II protease variant defined or pro- 
duced in any of claims 1 to 27. 



34. Use of a RP-II protease variant defined or produced in any of claims 1 to 27 for 
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washing or cleaning purposes. 

35. Use of a RP-II protease variant defined or produced in any of claims 1 to 27 for 
processing food. 


36. Use of a RP-II protease variant defined or produced in any of claims 1 to 27 for 
processing feed. 

37. Use of a RP-II protease variant defined or produced in any of claims 1 to 27 for
the treatment of hides.
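
The distance criteria used in the claims (for example, positions within 10 Å or 6 Å of the ion-binding site) can be illustrated with a short calculation. The following is a minimal sketch, assuming atoms are available as Cartesian coordinates in Å and that a residue counts as "within" the cutoff when at least one of its atoms does; that interpretation, the `Atom` type, the function `residues_within` and the coordinates are all hypothetical and are not taken from the application.

```python
import math
from typing import Iterable, NamedTuple, Set, Tuple

Point = Tuple[float, float, float]


class Atom(NamedTuple):
    chain: str
    res_seq: int
    name: str
    xyz: Point


def residues_within(atoms: Iterable[Atom], site: Point, cutoff: float) -> Set[Tuple[str, int]]:
    """Return (chain, residue number) pairs with at least one atom within cutoff of site."""
    return {(a.chain, a.res_seq) for a in atoms if math.dist(a.xyz, site) <= cutoff}


# Hypothetical coordinates, chosen only so the distances are easy to check by hand.
ion = (0.0, 0.0, 0.0)
atoms = [
    Atom("A", 5, "OD1", (3.0, 4.0, 0.0)),     # 5 Å from the ion
    Atom("A", 144, "NE2", (0.0, 0.0, 8.0)),   # 8 Å from the ion
    Atom("A", 50, "CA", (12.0, 0.0, 0.0)),    # 12 Å from the ion
]

print(sorted(residues_within(atoms, ion, 10.0)))
print(sorted(residues_within(atoms, ion, 6.0)))
```

With real data the `site` argument would be the coordinate of the bound ion (or each active site atom in turn), and the two cutoffs would give the broader and the preferred residue sets respectively.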
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ABSTRACT 

The present invention relates to methods for producing variants of a parent RP-II pro- 
tease and the variants having altered properties as compared to the parent RP-II pro- 
tease. 
1/3

Patent- og Varemærkestyrelsen (Danish Patent and Trademark Office)
Received 13 Feb. 2004



        [N-term][]           [ 2 ]    [ 3  ]    [4 ]
          + +   9            22  26   31   36   41 44
BLC   1 SVIGSDDRTRVTNTTAYPYRAIVHISSSIGSCTGWMIGPKTVATA 45

            []    []    [7 ]           [  8  ]  {   }
         *  50    56    62 65          77    83 86  90
BLC  46 GHCIYDTSSGSFAGTATVSPGRNGTSYPYGSVKSTRYFIPSGWRS 90

                [9 ]   {   }   []          [ 11 ]
             *  99 102 106 110 114         126  131
BLC  91 GNTNYDYGAIELSEPIGNTVGYFGYSYTTSSLVGTTVTISGYPGD 135

              [   12   ]    [13]           [ 14  ]
              142      151  156  +     *   171   177
BLC 136 KTAGTQWQHSGPIAISETYKLQYAMDTYGGQSGSPVFEQSSSRTN 180

         [   15    ]        [16 ]  {          }
         182       192      201    208        219
BLC 181 CSGPCSLAVHTNGVYGGSSYNRGTRITKEVFDNLTNWKNSAQ 222

*    Active site residue (47, 96, 167)
+    Calcium coordination residue (3, 5, 161)
[]   Short strand (9-10, 50-51, 56-57, 114-115)
[ ]  Long strand (22-26, 31-36, 41-44, 62-65, 77-83, 99-102,
     126-131, 142-151, 156-159, 171-177, 182-192, 201-205)
{ }  Helix (86-90, 106-110, 208-219)

Fig. 1
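
The substitution shorthand used in the claims (for example D7E, H144R or G30A: original residue, position in mature BLC, replacement residue) can be checked mechanically against the mature BLC sequence of Fig. 1. The following is a minimal sketch; the function name `apply_substitution` is hypothetical, and only single substitutions are handled, not the multi-option forms such as "D161R,K" or insertions and deletions.

```python
import re

# Mature BLC sequence (positions 1-222) as shown in Fig. 1 / SEQ ID NO: 2.
BLC_MATURE = (
    "SVIGSDDRTRVTNTTAYPYRAIVHISSSIGSCTGWMIGPKTVATA"
    "GHCIYDTSSGSFAGTATVSPGRNGTSYPYGSVKSTRYFIPSGWRS"
    "GNTNYDYGAIELSEPIGNTVGYFGYSYTTSSLVGTTVTISGYPGD"
    "KTAGTQWQHSGPIAISETYKLQYAMDTYGGQSGSPVFEQSSSRTN"
    "CSGPCSLAVHTNGVYGGSSYNRGTRITKEVFDNLTNWKNSAQ"
)


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a single substitution such as 'D7E' after checking the wild-type residue."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unsupported mutation format: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"{mutation}: expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    # Positions are 1-based in the claims, so shift by one for slicing.
    return sequence[: position - 1] + replacement + sequence[position:]


variant = apply_substitution(BLC_MATURE, "D7E")
print(variant[:10])
```

Checking the wild-type letter before substituting catches numbering mistakes early, which matters when positions are transferred between BLC and a homologous parent.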



2/3

BLC    .SVIGSDDRTRVTNTTAYPYRAIVHISSS******IGSCTGWMIGPKTVA 43
CDJ31  .SVIGSDERTRVTNTTAYPYRAIVHISSS******IGSCTGSLIGPKTVA
AC116  .SVIGSDERTRVTDTTAFPYRAIVHISSS******IGSCTGWLIGPKTVA
MIP    .WIGDDGRTKVANTRVAPYNSIAYITFG******GSSCTGTLIAPNKIL
JA96   .WIGDDGRTKVTNTRVAPYNSIAYITFG******GSSCTGTLIAPNKIL
B032   .WIGDDGRTKVANTRVAPYNSIAYTTFG******GSSCTGTLIAPNKIL
       abcdef
MPR    SIIGTDERTRISSTTSFPYRATVQLSIKYPNTSSTYGCTGFLVNPNTW
AA513  .VVIGDDGRRQVQNTSFMPFRALTYIEFG**NLTSTWSCSGGVIGTDLVV

BLC    TAGHCIYDTSSGSFAGTATVSPGRNGTSYPYGSVKSTRYFIPSGWR*SGN 92
CDJ31  TAGHCIYDTASGSFAGTATVSPGRNGSTYPYGSVTSTRYFIPSGYR*SGN
AC116  TAGHCVYDTASRSFAGTATVSPGRNGSAYPYGSVTSTRYFIPSGWQ*SGN
       a
MIP    TNGHCVYNTASRSYSAKGSVYPGMNDSTAVNGSANMTEFYVPSGYINTGA
JA96   TNGHCVYNTATRSYSAKGSVYPGMNDSTAVNGSANMTEFYVPSGYINTGA
B032   TNGHCVYNTASRSYSAKGSVYPGMNDSTAVNGSANMTEFYVPSGYINTGA
MPR    TAGHCVY*SQDHGWASTITAAPGRNGSSYPYGTYSGTMFYSVKGWTESKD
AA513  TNAHCV*****EGSVLAGTVVPGMNNSQWAYGHYRVTQIIYPDQYRNNGA

BLC    TNYDYGAIELS*****EPIGNTVGYFGYSYT*TSSLVGTTVTISGYPGDK 136
CDJ31  SNYDYGAIELS*****QPIGNTVGYFGYSYT*TSSLVGSSVTIIGYPGDK
AC116  SNYDYAAIELS*****QPIGNTVGYFGYSYT*ASSLAGAGVTISGYPGDK
MIP    SQYDFAVIKTD*****TNIGNTVGYRSIRQ**VTNLTGTTIKISGYPGDK
JA96   SQYDFAVIKTD*****TNIGNTVGYRSIRQ**VTNLTGTTIKISGYPGDK
B032   SQYDFAVIKTD*****TNIGNTVGYRSIRQ**VTNLTGTTIKISGYPGDK
       abcde a
MPR    TNYDYGAIKLN*****GSPGNTVGWYGYRTTNSSSPVGLSSSVTGFPCDK
AA513  SEFDYAILRVAPDSDGRHIGNRAGILSFTETGTVN*ENTFLRTYGYPGDK

BLC    T****AGTQWQHSGPIAISET*YKLQYAMDTYGGQSGSPVFEQSSSRTNC 181
CDJ31  T****SGTQWQMSGNIAVSET*YKLQYAIDTYGGQSGSPVYEASSSRTNC
AC116  T****TGTQWQMSGTIAVSET*YKLQYAIDTYGGQSGSPVYEKSSSRTNC
       abcd
MIP    MRSTGKVSQWEMSGSVTREDT*NLAYYTIDTFSGNSGSAMLDQ*******
JA96   MRSTGKVSQWEMSGPVTREDT*NLAYYTIDTFSGNSGSAMLDQ*******
B032   MRSTGKISQWEMSGPVTREDT*NLAYYMIDTFSGNSGSAMLDQ*******
       abcd a
MPR    T****FGTMWSDTKPIRSAET*YKLTYTTDTYGCQSGSPVYRNYSD****
AA513  ISETKLISLWGMVGRSDAFLHRDLLFYNMDTYFGQSGSPVLN********

BLC    SGPCSLAVHTNG**VYGGSSYNRGTRITKEVFDNLTNWKNSAQ 222
CDJ31  SGPCSLAVHTNG**VYGGSSYNRGTRITKEVFDNLTNWKNSAQ
AC116  SGPCSLAVHTNG**VYGGSSYNRGTRITKEVFDNFTSWKNSAQ
MIP    *NQQIVGVHNAG***YSNGTINGGPKATAAFVEFINYAKAQ**
JA96   *NQQIVGVHNAG***YSNGTINGGPKATAAFVEFINYAKAQ**
B032   *NQQIVGVHNAG***YSNGTINGGPKATAAFVEFINYAKAQ**
       ab
MPR    TGQTAIAIHTN*****GGSSYNLGTRVTNDVFNNIQYWANQ**
AA513  SVDSMVAVHNAGYIVGGNREINGGPKIRRDFTNLFNQMN****
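
Aligned rows such as those above can be compared column by column. The following is a minimal sketch of a percent identity calculation, assuming `*` marks a gap and that columns with a gap in either row are skipped; this simple identity measure and the function name `percent_identity` are illustrative assumptions and are not the homology definition used in the application. The two example rows are the final alignment block for BLC and AC116 only, so the result does not describe the full-length sequences.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "*") -> float:
    """Percent identity over aligned columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Final alignment block (BLC positions 182-222) for two of the aligned proteases.
blc = "SGPCSLAVHTNG**VYGGSSYNRGTRITKEVFDNLTNWKNSAQ"
ac116 = "SGPCSLAVHTNG**VYGGSSYNRGTRITKEVFDNFTSWKNSAQ"

print(round(percent_identity(blc, ac116), 1))
```

For a whole-protein figure the blocks of each row would first be concatenated in order before calling the function.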
SEQUENCE LISTING

<110> Novozymes A/S

<120> Protease variants

<130> 10517.000-DK

<140>
<141>

<160> 18

<170> PatentIn Ver. 2.1

<210> 1
<211> 948
<212> DNA
<213> Bacillus licheniformis

<220>
<221> CDS
<222> (1)..(948)

<220>
<221> mat_peptide
<222> (283)..(948)

<220>
<221> sig_peptide
<222> (1)..(90)
<223> propeptide (91) ... (282)

<400> 1

ttg gtt agt aaa aag agt gtt aaa cga ggt ttg atc aca ggt ctc att  48
Leu Val Ser Lys Lys Ser Val Lys Arg Gly Leu Ile Thr Gly Leu Ile
-90 -85 -80

ggt att tct att tat tct tta ggt atg cac ccg gcc caa gcc gcg cca  96
Gly Ile Ser Ile Tyr Ser Leu Gly Met His Pro Ala Gln Ala Ala Pro
-75 -70 -65

tcg cct cat act cct gtt tca agc gat cct tca tac aaa gcg gaa aca  144
Ser Pro His Thr Pro Val Ser Ser Asp Pro Ser Tyr Lys Ala Glu Thr
-60 -55 -50

tcg gtt act tat gac cca cac att aag agc gat caa tac ggc ttg tat  192
Ser Val Thr Tyr Asp Pro His Ile Lys Ser Asp Gln Tyr Gly Leu Tyr
-45 -40 -35

tca aaa gcg ttt aca ggc acc ggc aaa gtg aat gaa aca aag gaa aaa  240
Ser Lys Ala Phe Thr Gly Thr Gly Lys Val Asn Glu Thr Lys Glu Lys
-30 -25 -20 -15

gcg gaa aaa aag tca ccc gcc aaa gct cct tac agc att aaa tcg gtg  288
Ala Glu Lys Lys Ser Pro Ala Lys Ala Pro Tyr Ser Ile Lys Ser Val
-10 -5 -1 1

att ggt tct gat gat cgg aca agg gtc acc aac aca acc gca tat ccg  336
Ile Gly Ser Asp Asp Arg Thr Arg Val Thr Asn Thr Thr Ala Tyr Pro
5 10 15

tac aga gcg atc gtt cat att tca agc agc atc ggt tca tgc acc gga  384
Tyr Arg Ala Ile Val His Ile Ser Ser Ser Ile Gly Ser Cys Thr Gly
20 25 30

tgg atg atc ggt ccg aaa acc gtc gca aca gcc gga cac tgc atc tat  432
Trp Met Ile Gly Pro Lys Thr Val Ala Thr Ala Gly His Cys Ile Tyr
35 40 45 50

gac aca tca agc ggt tca ttt gcc ggt aca gcc act gtt tcg ccg gga  480
Asp Thr Ser Ser Gly Ser Phe Ala Gly Thr Ala Thr Val Ser Pro Gly
55 60 65

cgg aac ggg aca agc tat cct tac ggc tca gtt aaa tcg acg cgc tac  528
Arg Asn Gly Thr Ser Tyr Pro Tyr Gly Ser Val Lys Ser Thr Arg Tyr
70 75 80

ttt att ccg tca gga tgg aga agc gga aac acc aat tac gat tac gga  576
Phe Ile Pro Ser Gly Trp Arg Ser Gly Asn Thr Asn Tyr Asp Tyr Gly
85 90 95

gca atc gaa cta agc gaa ccg atc ggc aat act gtc gga tac ttc gga  624
Ala Ile Glu Leu Ser Glu Pro Ile Gly Asn Thr Val Gly Tyr Phe Gly
100 105 110

tac tcg tac act act tca tca ctt gtt ggg aca act gtt acc atc agc  672
Tyr Ser Tyr Thr Thr Ser Ser Leu Val Gly Thr Thr Val Thr Ile Ser
115 120 125 130

ggc tac cca ggc gat aaa aca gca ggc aca caa tgg cag cat tca gga  720
Gly Tyr Pro Gly Asp Lys Thr Ala Gly Thr Gln Trp Gln His Ser Gly
135 140 145

ccg att gcc atc tcc gaa acg tat aaa ttg cag tac gca atg gac acg  768
Pro Ile Ala Ile Ser Glu Thr Tyr Lys Leu Gln Tyr Ala Met Asp Thr
150 155 160

tac gga gga caa agc ggt tca ccg gta ttc gaa caa agc agc tcc aga  816
Tyr Gly Gly Gln Ser Gly Ser Pro Val Phe Glu Gln Ser Ser Ser Arg
165 170 175

acg aac tgt agc ggt ccg tgc tcg ctt gcc gta cac aca aat gga gta  864
Thr Asn Cys Ser Gly Pro Cys Ser Leu Ala Val His Thr Asn Gly Val
180 185 190

tac ggc ggc tcc tcg tac aac aga ggc acc cgg att aca aaa gag gtg  912
Tyr Gly Gly Ser Ser Tyr Asn Arg Gly Thr Arg Ile Thr Lys Glu Val
195 200 205 210

ttc gac aat ttg acc aac tgg aaa aac agc gca caa  948
Phe Asp Asn Leu Thr Asn Trp Lys Asn Ser Ala Gln
215 220
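
The correspondence between the codon rows and the amino acid rows in the listing can be checked by translating the DNA with the standard genetic code. The following is a minimal sketch; the function name `translate` is illustrative, and the excerpt is the start of the mature peptide coding region of SEQ ID NO: 1 (nucleotides 283-312), read with the codons tcg and cgg for Ser and Arg.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes ('*' = stop)."""
    dna = "".join(dna.lower().split())  # drop the spaces between codons
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First ten codons of the mature peptide in SEQ ID NO: 1.
excerpt = "tcg gtg att ggt tct gat gat cgg aca agg"
print(translate(excerpt))
```

Translating a full coding sequence this way and comparing it with the corresponding protein entry is a quick consistency check on a transcribed listing.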

<210> 2
<211> 316
<212> PRT
<213> Bacillus licheniformis

<400> 2

Leu Val Ser Lys Lys Ser Val Lys Arg Gly Leu Ile Thr Gly Leu Ile
-90 -85 -80

Gly Ile Ser Ile Tyr Ser Leu Gly Met His Pro Ala Gln Ala Ala Pro
-75 -70 -65

Ser Pro His Thr Pro Val Ser Ser Asp Pro Ser Tyr Lys Ala Glu Thr
-60 -55 -50

Ser Val Thr Tyr Asp Pro His Ile Lys Ser Asp Gln Tyr Gly Leu Tyr
-45 -40 -35

Ser Lys Ala Phe Thr Gly Thr Gly Lys Val Asn Glu Thr Lys Glu Lys
-30 -25 -20 -15

Ala Glu Lys Lys Ser Pro Ala Lys Ala Pro Tyr Ser Ile Lys Ser Val
-10 -5 -1 1

Ile Gly Ser Asp Asp Arg Thr Arg Val Thr Asn Thr Thr Ala Tyr Pro
5 10 15

Tyr Arg Ala Ile Val His Ile Ser Ser Ser Ile Gly Ser Cys Thr Gly
20 25 30

Trp Met Ile Gly Pro Lys Thr Val Ala Thr Ala Gly His Cys Ile Tyr
35 40 45 50

Asp Thr Ser Ser Gly Ser Phe Ala Gly Thr Ala Thr Val Ser Pro Gly
55 60 65

Arg Asn Gly Thr Ser Tyr Pro Tyr Gly Ser Val Lys Ser Thr Arg Tyr
70 75 80

Phe Ile Pro Ser Gly Trp Arg Ser Gly Asn Thr Asn Tyr Asp Tyr Gly
85 90 95

Ala Ile Glu Leu Ser Glu Pro Ile Gly Asn Thr Val Gly Tyr Phe Gly
100 105 110

Tyr Ser Tyr Thr Thr Ser Ser Leu Val Gly Thr Thr Val Thr Ile Ser
115 120 125 130

Gly Tyr Pro Gly Asp Lys Thr Ala Gly Thr Gln Trp Gln His Ser Gly
135 140 145

Pro Ile Ala Ile Ser Glu Thr Tyr Lys Leu Gln Tyr Ala Met Asp Thr
150 155 160

Tyr Gly Gly Gln Ser Gly Ser Pro Val Phe Glu Gln Ser Ser Ser Arg
165 170 175

Thr Asn Cys Ser Gly Pro Cys Ser Leu Ala Val His Thr Asn Gly Val
180 185 190

Tyr Gly Gly Ser Ser Tyr Asn Arg Gly Thr Arg Ile Thr Lys Glu Val
195 200 205 210

Phe Asp Asn Leu Thr Asn Trp Lys Asn Ser Ala Gln
215 220



<210> 3
<211> 1026
<212> DNA
<213> Bacillus halmapalus AA513

<220>
<221> CDS
<222> (1)..(1026)

<220>
<221> mat_peptide
<222> (361)..(1026)

<220>
<221> sig_peptide
<222> (1)..(78)
<223> Pro-peptide (79) ... (360)

<400> 3

atg aaa cta cta tta aaa ctt act ttt gta tgc ata ttt atg tta agt  48
Met Lys Leu Leu Leu Lys Leu Thr Phe Val Cys Ile Phe Met Leu Ser
-120 -115 -110 -105

ggg att cta tcc cca gta aac gca act caa gct gag act ctt act aaa  96
Gly Ile Leu Ser Pro Val Asn Ala Thr Gln Ala Glu Thr Leu Thr Lys
-100 -95 -90

tta aat aaa ata agt cag aag cag gaa cca tca tat aaa cta gat gaa  144
Leu Asn Lys Ile Ser Gln Lys Gln Glu Pro Ser Tyr Lys Leu Asp Glu
-85 -80 -75

gaa atg gat tat gtt cta att gat ttg gaa aca caa tct gaa tcg att  192
Glu Met Asp Tyr Val Leu Ile Asp Leu Glu Thr Gln Ser Glu Ser Ile
-70 -65 -60

att tcg ata gga gat aat acc gat ttg gga gat caa tcg ttt act tct  240
Ile Ser Ile Gly Asp Asn Thr Asp Leu Gly Asp Gln Ser Phe Thr Ser
-55 -50 -45

tta ggg aag gtg gga cat gga gaa ctt gag aaa att aac tta gaa gaa  288
Leu Gly Lys Val Gly His Gly Glu Leu Glu Lys Ile Asn Leu Glu Glu
-40 -35 -30 -25

ttt cgt aat cct aat tta aca gta gta gac ccg tta aca cgt aag cct  336
Phe Arg Asn Pro Asn Leu Thr Val Val Asp Pro Leu Thr Arg Lys Pro
-20 -15 -10

att gaa caa aaa atc agc cct ttt gtt gtt ata ggc gat gat ggg aga  384
Ile Glu Gln Lys Ile Ser Pro Phe Val Val Ile Gly Asp Asp Gly Arg
-5 -1 1 5

aga caa gtt caa aat act tct ttc atg cca ttt cgt gca ctt act tat  432
Arg Gln Val Gln Asn Thr Ser Phe Met Pro Phe Arg Ala Leu Thr Tyr
10 15 20

att gag ttt gga aac ctt aca agt aca tgg agt tgt tct gga ggt gtg  480
Ile Glu Phe Gly Asn Leu Thr Ser Thr Trp Ser Cys Ser Gly Gly Val
25 30 35 40

att gga aca gat tta gtt gtt act aat gca cat tgt gta gaa ggt tct  528
Ile Gly Thr Asp Leu Val Val Thr Asn Ala His Cys Val Glu Gly Ser
45 50 55

gtg tta gca ggt act gta gtt cct ggt atg aac aat agt cag tgg gca  576
Val Leu Ala Gly Thr Val Val Pro Gly Met Asn Asn Ser Gln Trp Ala
60 65 70

tat ggg cat tat agg gtt act cag att atc tac cct gat caa tac aga  624
Tyr Gly His Tyr Arg Val Thr Gln Ile Ile Tyr Pro Asp Gln Tyr Arg
75 80 85

aat aac ggt gct tca gag ttt gat tat gct ata ctt aga gta gca cct  672
Asn Asn Gly Ala Ser Glu Phe Asp Tyr Ala Ile Leu Arg Val Ala Pro
90 95 100

gac tct gat gga cgt cat att gga aac aga gct gga att tta tct ttt  720
Asp Ser Asp Gly Arg His Ile Gly Asn Arg Ala Gly Ile Leu Ser Phe
105 110 115 120

aca gaa aca gga act gtt aac gaa aat act ttt cta aga acg tat gga  768
Thr Glu Thr Gly Thr Val Asn Glu Asn Thr Phe Leu Arg Thr Tyr Gly
125 130 135

tac ccc ggt gat aaa ata tca gag aca aaa tta att tct ttg tgg gga  816
Tyr Pro Gly Asp Lys Ile Ser Glu Thr Lys Leu Ile Ser Leu Trp Gly
140 145 150

atg gtt ggt cga tct gat gca ttt ttg cat cga gac cta ctg ttc tac  864
Met Val Gly Arg Ser Asp Ala Phe Leu His Arg Asp Leu Leu Phe Tyr
155 160 165

aat atg gac acc tat ttt ggt caa tca ggt tct cct gta tta aac agc  912
Asn Met Asp Thr Tyr Phe Gly Gln Ser Gly Ser Pro Val Leu Asn Ser
170 175 180

gta gat tca atg gtt gcg gtt cat aat gca ggg tat atc gtt ggt ggt  960
Val Asp Ser Met Val Ala Val His Asn Ala Gly Tyr Ile Val Gly Gly
185 190 195 200

aat agg gaa att aat ggt ggt cct aaa atc aga aga gat ttt aca aac  1008
Asn Arg Glu Ile Asn Gly Gly Pro Lys Ile Arg Arg Asp Phe Thr Asn
205 210 215

tta ttt aat caa atg aac  1026
Leu Phe Asn Gln Met Asn
220

<210> 4
<211> 342
<212> PRT
<213> Bacillus halmapalus AA513

<400> 4

Met Lys Leu Leu Leu Lys Leu Thr Phe Val Cys Ile Phe Met Leu Ser
-120 -115 -110 -105

Gly Ile Leu Ser Pro Val Asn Ala Thr Gln Ala Glu Thr Leu Thr Lys
-100 -95 -90

Leu Asn Lys Ile Ser Gln Lys Gln Glu Pro Ser Tyr Lys Leu Asp Glu
-85 -80 -75

Glu Met Asp Tyr Val Leu Ile Asp Leu Glu Thr Gln Ser Glu Ser Ile
-70 -65 -60

Ile Ser Ile Gly Asp Asn Thr Asp Leu Gly Asp Gln Ser Phe Thr Ser
-55 -50 -45

Leu Gly Lys Val Gly His Gly Glu Leu Glu Lys Ile Asn Leu Glu Glu
-40 -35 -30 -25

Phe Arg Asn Pro Asn Leu Thr Val Val Asp Pro Leu Thr Arg Lys Pro
-20 -15 -10

Ile Glu Gln Lys Ile Ser Pro Phe Val Val Ile Gly Asp Asp Gly Arg
-5 -1 1 5

Arg Gln Val Gln Asn Thr Ser Phe Met Pro Phe Arg Ala Leu Thr Tyr
10 15 20

Ile Glu Phe Gly Asn Leu Thr Ser Thr Trp Ser Cys Ser Gly Gly Val
25 30 35 40

Ile Gly Thr Asp Leu Val Val Thr Asn Ala His Cys Val Glu Gly Ser
45 50 55

Val Leu Ala Gly Thr Val Val Pro Gly Met Asn Asn Ser Gln Trp Ala
60 65 70

Tyr Gly His Tyr Arg Val Thr Gln Ile Ile Tyr Pro Asp Gln Tyr Arg
75 80 85

Asn Asn Gly Ala Ser Glu Phe Asp Tyr Ala Ile Leu Arg Val Ala Pro
90 95 100

Asp Ser Asp Gly Arg His Ile Gly Asn Arg Ala Gly Ile Leu Ser Phe
105 110 115 120

Thr Glu Thr Gly Thr Val Asn Glu Asn Thr Phe Leu Arg Thr Tyr Gly
125 130 135

Tyr Pro Gly Asp Lys Ile Ser Glu Thr Lys Leu Ile Ser Leu Trp Gly
140 145 150

Met Val Gly Arg Ser Asp Ala Phe Leu His Arg Asp Leu Leu Phe Tyr
155 160 165

Asn Met Asp Thr Tyr Phe Gly Gln Ser Gly Ser Pro Val Leu Asn Ser
170 175 180

Val Asp Ser Met Val Ala Val His Asn Ala Gly Tyr Ile Val Gly Gly
185 190 195 200

Asn Arg Glu Ile Asn Gly Gly Pro Lys Ile Arg Arg Asp Phe Thr Asn
205 210 215

Leu Phe Asn Gln Met Asn
220



<210> 5 
<211> 942 
<212> DNA 

<213> Bacillus licheniformis AC116 

<220> 

<221> CDS 

<222> (1)..(942) 

<220> 

<221> mat_peptide 
<222> (277). .(942) 



<220> 

<221> sig_peptide 
<222> (1). . (87) 

<223> pro-peptide (88) ... (276) 



<400> 5 

atg gcg aaa aat ggt gtt tca cgc gtt ttc att gcc gga ctc atc gga 48
Met Ala Lys Asn Gly Val Ser Arg Val Phe Ile Ala Gly Leu Ile Gly
-90 -85 -80

att tct att ttt tct tcg ggc att tac tct gca caa gct gca tca tcg 96
Ile Ser Ile Phe Ser Ser Gly Ile Tyr Ser Ala Gln Ala Ala Ser Ser
-75 -70 -65

ccg cat acc cca gtc tcc agc gac cct tcg tac aag ccc ggc tcc acc 144
Pro His Thr Pro Val Ser Ser Asp Pro Ser Tyr Lys Pro Gly Ser Thr
-60 -55 -50 -45

tat gat ccc aac ata aaa att gac aat aac ggc gca tat tcg aaa gcc 192
Tyr Asp Pro Asn Ile Lys Ile Asp Asn Asn Gly Ala Tyr Ser Lys Ala
-40 -35 -30

ttc gaa gga acc gga aca ccc ggc ggc tcc gtt cag gcc aaa ccg aaa 240
Phe Glu Gly Thr Gly Thr Pro Gly Gly Ser Val Gln Ala Lys Pro Lys
-25 -20 -15

aaa gaa tcg ccc gcc ggc ccg cct tac agc cct aaa tcg gta atc ggc 288
Lys Glu Ser Pro Ala Gly Pro Pro Tyr Ser Pro Lys Ser Val Ile Gly
-10 -5 -1 1

tca gat gaa cgg aca agg gtg act gat aca acg gcc ttt cca tac aga 336
Ser Asp Glu Arg Thr Arg Val Thr Asp Thr Thr Ala Phe Pro Tyr Arg
5 10 15 20

gca atc gtc cat att tca agc agc atc ggc tca tgc aca ggc tgg ctg 384
Ala Ile Val His Ile Ser Ser Ser Ile Gly Ser Cys Thr Gly Trp Leu
25 30 35

atc gga ccg aaa acg gta gca acg gcc ggg cac tgc gtc tat gac acg 432
Ile Gly Pro Lys Thr Val Ala Thr Ala Gly His Cys Val Tyr Asp Thr
40 45 50

gca agc cga tca ttc gcg gga acc gcc acc gtt tcc ccg gga cga aac 480
Ala Ser Arg Ser Phe Ala Gly Thr Ala Thr Val Ser Pro Gly Arg Asn
55 60 65

ggt tca gct tac cct tac gga tct gtt aca tcg acc cgc tat ttc atc 528
Gly Ser Ala Tyr Pro Tyr Gly Ser Val Thr Ser Thr Arg Tyr Phe Ile
70 75 80

ccg tcg ggt tgg cag agc gga aat tcc aat tat gac tac gca gcg atc 576
Pro Ser Gly Trp Gln Ser Gly Asn Ser Asn Tyr Asp Tyr Ala Ala Ile
85 90 95 100

gag ctc agc cag ccg atc ggc aat acc gtc gga tat ttc gga tat tca 624
Glu Leu Ser Gln Pro Ile Gly Asn Thr Val Gly Tyr Phe Gly Tyr Ser
105 110 115

tac acc gct tca tcg ctt gca gga gca ggc gtg acc atc agc gga tat 672
Tyr Thr Ala Ser Ser Leu Ala Gly Ala Gly Val Thr Ile Ser Gly Tyr
120 125 130

cca gga gac aaa aca aca ggc acc cag tgg caa atg tcc gga acg atc 720
Pro Gly Asp Lys Thr Thr Gly Thr Gln Trp Gln Met Ser Gly Thr Ile
135 140 145

gct gtt tca gaa acg tat aaa ctg caa tat gcg atc gac aca tac gga 768
Ala Val Ser Glu Thr Tyr Lys Leu Gln Tyr Ala Ile Asp Thr Tyr Gly
150 155 160

ggt caa agc ggt tcc ccg gta tat gag aaa agc agt tca agg aca aac 816
Gly Gln Ser Gly Ser Pro Val Tyr Glu Lys Ser Ser Ser Arg Thr Asn
165 170 175 180

tgc agc ggc cca tgc tcg ctg gcc gtt cat acg aac ggc gtg tac gga 864
Cys Ser Gly Pro Cys Ser Leu Ala Val His Thr Asn Gly Val Tyr Gly
185 190 195

gga tcc tct tac aac aga ggc acc cgc att acg aaa gaa gta ttt gat 912
Gly Ser Ser Tyr Asn Arg Gly Thr Arg Ile Thr Lys Glu Val Phe Asp
200 205 210

aat ttc aca agc tgg aaa aac agc gca cag 942
Asn Phe Thr Ser Trp Lys Asn Ser Ala Gln
215 220 
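
The codon lines and the three-letter translation lines of a CDS entry such as SEQ ID NO: 5 can be cross-checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative and are not part of the listing. It translates the first sixteen codons (nucleotides 1 to 48) of SEQ ID NO: 5.

```python
from itertools import product

BASES = "tcag"
# Standard genetic code, with codons enumerated in t, c, a, g order
# at each of the three positions (ttt, ttc, tta, ttg, tct, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate space-separated codons into three-letter residue names."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in dna.lower().split()]


# First codon line of SEQ ID NO: 5 (hypothetical variable name).
first_line = "atg gcg aaa aat ggt gtt tca cgc gtt ttc att gcc gga ctc atc gga"
print(" ".join(translate(first_line)))
```

A codon that is not a valid triplet of a, c, g and t raises a `KeyError`, which makes a miskeyed or misread codon easy to spot.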

<210> 6 
<211> 314 
<212> PRT 

<213> Bacillus licheniformis AC116
<400> 6 

Met Ala Lys Asn Gly Val Ser Arg Val Phe Ile Ala Gly Leu Ile Gly
-90 -85 -80

Ile Ser Ile Phe Ser Ser Gly Ile Tyr Ser Ala Gln Ala Ala Ser Ser
-75 -70 -65

Pro His Thr Pro Val Ser Ser Asp Pro Ser Tyr Lys Pro Gly Ser Thr
-60 -55 -50 -45

Tyr Asp Pro Asn Ile Lys Ile Asp Asn Asn Gly Ala Tyr Ser Lys Ala
-40 -35 -30

Phe Glu Gly Thr Gly Thr Pro Gly Gly Ser Val Gln Ala Lys Pro Lys
-25 -20 -15

Lys Glu Ser Pro Ala Gly Pro Pro Tyr Ser Pro Lys Ser Val Ile Gly
-10 -5 -1 1

Ser Asp Glu Arg Thr Arg Val Thr Asp Thr Thr Ala Phe Pro Tyr Arg
5 10 15 20

Ala Ile Val His Ile Ser Ser Ser Ile Gly Ser Cys Thr Gly Trp Leu
25 30 35

Ile Gly Pro Lys Thr Val Ala Thr Ala Gly His Cys Val Tyr Asp Thr
40 45 50

Ala Ser Arg Ser Phe Ala Gly Thr Ala Thr Val Ser Pro Gly Arg Asn
55 60 65

Gly Ser Ala Tyr Pro Tyr Gly Ser Val Thr Ser Thr Arg Tyr Phe Ile
70 75 80

Pro Ser Gly Trp Gln Ser Gly Asn Ser Asn Tyr Asp Tyr Ala Ala Ile
85 90 95 100

Glu Leu Ser Gln Pro Ile Gly Asn Thr Val Gly Tyr Phe Gly Tyr Ser
105 110 115

Tyr Thr Ala Ser Ser Leu Ala Gly Ala Gly Val Thr Ile Ser Gly Tyr
120 125 130

Pro Gly Asp Lys Thr Thr Gly Thr Gln Trp Gln Met Ser Gly Thr Ile
135 140 145

Ala Val Ser Glu Thr Tyr Lys Leu Gln Tyr Ala Ile Asp Thr Tyr Gly
150 155 160

Gly Gln Ser Gly Ser Pro Val Tyr Glu Lys Ser Ser Ser Arg Thr Asn
165 170 175 180

Cys Ser Gly Pro Cys Ser Leu Ala Val His Thr Asn Gly Val Tyr Gly
185 190 195

Gly Ser Ser Tyr Asn Arg Gly Thr Arg Ile Thr Lys Glu Val Phe Asp
200 205 210

Asn Phe Thr Ser Trp Lys Asn Ser Ala Gln
215 220



<210> 7 
<211> 909 
<212> DNA 

<213> Bacillus pumilus B032 

<220>
<221> CDS
<222> (1)..(909)

<220>
<221> mat_peptide
<222> (265)..(909)

<220>
<221> sig_peptide
<222> (1)..(78)
<223> pro-peptide (79) ... (264)

<400> 7

atg atg aaa aag gtg aaa atg tta ctc cct tct cta ctt gtt ttt ggt 48
Met Met Lys Lys Val Lys Met Leu Leu Pro Ser Leu Leu Val Phe Gly
-85 -80 -75

gct tta agt gtg cct agt ttt gcc cat gcc gca tct gat tca gtg cta 96
Ala Leu Ser Val Pro Ser Phe Ala His Ala Ala Ser Asp Ser Val Leu
-70 -65 -60

acg tct gat tat gac atg gtg act tct gat gga aag gtg atc tct tca 144
Thr Ser Asp Tyr Asp Met Val Thr Ser Asp Gly Lys Val Ile Ser Ser
-55 -50 -45

agt gat ttc cac aat gat acg aaa tcc ccc tca tcc ttt gat aaa gtg 192
Ser Asp Phe His Asn Asp Thr Lys Ser Pro Ser Ser Phe Asp Lys Val
-40 -35 -30 -25

gat gat cta tct tca act gtt ggt gaa aaa gta aaa cca cta tca aaa 240
Asp Asp Leu Ser Ser Thr Val Gly Glu Lys Val Lys Pro Leu Ser Lys
-20 -15 -10

tat tta aaa gac ttt caa aca aaa gtc gtc att gga gac gat ggt aga 288
Tyr Leu Lys Asp Phe Gln Thr Lys Val Val Ile Gly Asp Asp Gly Arg
-5 -1 1 5

aca aaa gta gca aat aca aga gtg gca cca tat aat tca att gct tat 336
Thr Lys Val Ala Asn Thr Arg Val Ala Pro Tyr Asn Ser Ile Ala Tyr
10 15 20

act acg ttt ggc ggc tcc agc tgc acg ggg acc ctg att gcc cct aac 384
Thr Thr Phe Gly Gly Ser Ser Cys Thr Gly Thr Leu Ile Ala Pro Asn
25 30 35 40

aaa att ttg aca aac gga cac tgc gtg tac aat aca gca tcc aga agt 432
Lys Ile Leu Thr Asn Gly His Cys Val Tyr Asn Thr Ala Ser Arg Ser
45 50 55

tat agt gca aaa gga tcg gtg tat cca ggc atg aat gat agt act gcg 480
Tyr Ser Ala Lys Gly Ser Val Tyr Pro Gly Met Asn Asp Ser Thr Ala
60 65 70

gtg aat ggc tca gca aat atg aca gag ttc tat gta cca agc ggg tat 528
Val Asn Gly Ser Ala Asn Met Thr Glu Phe Tyr Val Pro Ser Gly Tyr
75 80 85

atc aat aca ggt gcg agc caa tat gat ttt gcc gtg atc aaa aca gat 576
Ile Asn Thr Gly Ala Ser Gln Tyr Asp Phe Ala Val Ile Lys Thr Asp
90 95 100

acg aac att ggc aat aca gtt ggt tac cgt tcc atc cgt cag gtg aca 624
Thr Asn Ile Gly Asn Thr Val Gly Tyr Arg Ser Ile Arg Gln Val Thr
105 110 115 120

aac tta act ggg aca acg att aaa att tct gga tat cca ggt gat aaa 672
Asn Leu Thr Gly Thr Thr Ile Lys Ile Ser Gly Tyr Pro Gly Asp Lys
125 130 135

atg aga tca act ggc aag atc tcg cag tgg gag atg tca ggt cct gtg 720
Met Arg Ser Thr Gly Lys Ile Ser Gln Trp Glu Met Ser Gly Pro Val
140 145 150

aca aga gaa gat acg aat ctc gca tac tat atg att gat aca ttt agt 768
Thr Arg Glu Asp Thr Asn Leu Ala Tyr Tyr Met Ile Asp Thr Phe Ser
155 160 165

gga aat tca ggc tca gcg atg cta gat caa aat cag caa att gtt ggg 816
Gly Asn Ser Gly Ser Ala Met Leu Asp Gln Asn Gln Gln Ile Val Gly
170 175 180

gtt cat aac gca ggg tat tca aac ggt acg att aat ggc ggt cca aaa 864
Val His Asn Ala Gly Tyr Ser Asn Gly Thr Ile Asn Gly Gly Pro Lys
185 190 195 200

gcg aca gct gcc ttt gtt gaa ttt atc aac tat gca aaa gcg caa 909
Ala Thr Ala Ala Phe Val Glu Phe Ile Asn Tyr Ala Lys Ala Gln
205 210 215 
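
The feature coordinates given for SEQ ID NO: 7 (CDS 1 to 909, signal peptide 1 to 78, pro-peptide 79 to 264, mature peptide 265 to 909) determine the residue counts and the negative residue numbering used above. The following is a minimal Python sketch of that arithmetic; the helper name `residues` and the dictionary layout are illustrative assumptions, not part of the listing.

```python
def residues(start, end):
    """Number of codons spanned by an inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return length // 3


# Inclusive nucleotide coordinates as stated in the feature table of SEQ ID NO: 7.
features = {
    "CDS": (1, 909),
    "sig_peptide": (1, 78),
    "pro-peptide": (79, 264),
    "mat_peptide": (265, 909),
}

counts = {name: residues(*span) for name, span in features.items()}

# Residues before the mature peptide are numbered negatively, so the
# first residue of the precursor carries minus the pre-pro length.
first_label = -(counts["sig_peptide"] + counts["pro-peptide"])

print(counts)
print("first residue label:", first_label)
print("last residue label:", counts["mat_peptide"])
```

The total of the three parts equals the CDS count, which in turn agrees with the length of 303 declared under <211> for the corresponding protein entry, SEQ ID NO: 8.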

<210> 8 
<211> 303 
<212> PRT 

<213> Bacillus pumilus B032 
<400> 8 

Met Met Lys Lys Val Lys Met Leu Leu Pro Ser Leu Leu Val Phe Gly
-85 -80 -75

Ala Leu Ser Val Pro Ser Phe Ala His Ala Ala Ser Asp Ser Val Leu
-70 -65 -60

Thr Ser Asp Tyr Asp Met Val Thr Ser Asp Gly Lys Val Ile Ser Ser
-55 -50 -45

Ser Asp Phe His Asn Asp Thr Lys Ser Pro Ser Ser Phe Asp Lys Val
-40 -35 -30 -25

Asp Asp Leu Ser Ser Thr Val Gly Glu Lys Val Lys Pro Leu Ser Lys
-20 -15 -10

Tyr Leu Lys Asp Phe Gln Thr Lys Val Val Ile Gly Asp Asp Gly Arg
-5 -1 1 5

Thr Lys Val Ala Asn Thr Arg Val Ala Pro Tyr Asn Ser Ile Ala Tyr
10 15 20

Thr Thr Phe Gly Gly Ser Ser Cys Thr Gly Thr Leu Ile Ala Pro Asn
25 30 35 40

Lys Ile Leu Thr Asn Gly His Cys Val Tyr Asn Thr Ala Ser Arg Ser
45 50 55

Tyr Ser Ala Lys Gly Ser Val Tyr Pro Gly Met Asn Asp Ser Thr Ala
60 65 70

Val Asn Gly Ser Ala Asn Met Thr Glu Phe Tyr Val Pro Ser Gly Tyr
75 80 85

Ile Asn Thr Gly Ala Ser Gln Tyr Asp Phe Ala Val Ile Lys Thr Asp
90 95 100

Thr Asn Ile Gly Asn Thr Val Gly Tyr Arg Ser Ile Arg Gln Val Thr
105 110 115 120

Asn Leu Thr Gly Thr Thr Ile Lys Ile Ser Gly Tyr Pro Gly Asp Lys
125 130 135

Met Arg Ser Thr Gly Lys Ile Ser Gln Trp Glu Met Ser Gly Pro Val
140 145 150

Thr Arg Glu Asp Thr Asn Leu Ala Tyr Tyr Met Ile Asp Thr Phe Ser
155 160 165

Gly Asn Ser Gly Ser Ala Met Leu Asp Gln Asn Gln Gln Ile Val Gly
170 175 180

Val His Asn Ala Gly Tyr Ser Asn Gly Thr Ile Asn Gly Gly Pro Lys
185 190 195 200

Ala Thr Ala Ala Phe Val Glu Phe Ile Asn Tyr Ala Lys Ala Gln
205 210 215



<210> 9 
<211> 954 
<212> DNA 

<213> Bacillus licheniformis CDJ31

<220>
<221> CDS
<222> (1)..(954)

<220>
<221> mat_peptide
<222> (289)..(954)

<220>
<221> sig_peptide
<222> (1)..(84)
<223> pro-peptide (85) ... (288)

<400> 9

atg aaa aaa agt gtg aca cgc gta tta atg gcc ggt ctt att gga ata 48
Met Lys Lys Ser Val Thr Arg Val Leu Met Ala Gly Leu Ile Gly Ile
-95 -90 -85

tct att tat tct atg ggc atc gac tcc gct caa gct gca tca tcg ccg 96
Ser Ile Tyr Ser Met Gly Ile Asp Ser Ala Gln Ala Ala Ser Ser Pro
-80 -75 -70 -65

cat act cct gtc tct agc gat cct tca tac aag ccc gac tca tcc gca 144
His Thr Pro Val Ser Ser Asp Pro Ser Tyr Lys Pro Asp Ser Ser Ala
-60 -55 -50

agc tat gat cct gct att aaa acc aac aaa aac ggc gcc tat tca aaa 192
Ser Tyr Asp Pro Ala Ile Lys Thr Asn Lys Asn Gly Ala Tyr Ser Lys
-45 -40 -35

gca ttt gaa ggt aca gga aaa cta gac gct ccc ctt tat cag gaa aaa 240
Ala Phe Glu Gly Thr Gly Lys Leu Asp Ala Pro Leu Tyr Gln Glu Lys
-30 -25 -20

agc aaa cca acc aaa aaa tcc cct gcc gga cca cgt tac agc ccc aaa 288
Ser Lys Pro Thr Lys Lys Ser Pro Ala Gly Pro Arg Tyr Ser Pro Lys
-15 -10 -5 -1

tcc gtg att ggt tct gat gaa cgg acg aga gtg aca aac act acc gca 336
Ser Val Ile Gly Ser Asp Glu Arg Thr Arg Val Thr Asn Thr Thr Ala
1 5 10 15

tat cca tac aga gcg atc gtg cat att tca agc agc atc ggg tct tgc 384
Tyr Pro Tyr Arg Ala Ile Val His Ile Ser Ser Ser Ile Gly Ser Cys
20 25 30

acc ggc tcc ctg atc ggt ccg aaa acg gtg gca acg gcc gga cac tgc 432
Thr Gly Ser Leu Ile Gly Pro Lys Thr Val Ala Thr Ala Gly His Cys
35 40 45

att tat gac aca gcg agc ggg tca ttc gcc gga acc gct acc gtt tct 480
Ile Tyr Asp Thr Ala Ser Gly Ser Phe Ala Gly Thr Ala Thr Val Ser
50 55 60

ccg gga cgg aac ggt tca aca tat ccg tac gga tca gtt aca tca acc 528
Pro Gly Arg Asn Gly Ser Thr Tyr Pro Tyr Gly Ser Val Thr Ser Thr
65 70 75 80

cgc tat ttc atc ccg tca ggc tat cga agc gga aat tcg aat tac gac 576
Arg Tyr Phe Ile Pro Ser Gly Tyr Arg Ser Gly Asn Ser Asn Tyr Asp
85 90 95

tac gga gcc ata gag ctc agc cag ccg atc ggc aac acc gtc ggg tat 624
Tyr Gly Ala Ile Glu Leu Ser Gln Pro Ile Gly Asn Thr Val Gly Tyr
100 105 110

ttc gga tat tcc tac acc acc tcg tct ctc gtt ggg tca agc gtt acc 672
Phe Gly Tyr Ser Tyr Thr Thr Ser Ser Leu Val Gly Ser Ser Val Thr
115 120 125

atc atc gga tat cca ggc gac aaa aca tcg ggc acc caa tgg cag atg 720
Ile Ile Gly Tyr Pro Gly Asp Lys Thr Ser Gly Thr Gln Trp Gln Met
130 135 140

tcc gga aat atc gcc gtc tca gaa aca tat aaa ctg caa tat gcg atc 768
Ser Gly Asn Ile Ala Val Ser Glu Thr Tyr Lys Leu Gln Tyr Ala Ile
145 150 155 160

gac aca tac gga ggg cag agc ggc tct ccc gta tat gag gcg agc agc 816
Asp Thr Tyr Gly Gly Gln Ser Gly Ser Pro Val Tyr Glu Ala Ser Ser
165 170 175

tcc aga acg aat tgc agc ggc cca tgt tcg ctg gcc gtt cat acg aat 864
Ser Arg Thr Asn Cys Ser Gly Pro Cys Ser Leu Ala Val His Thr Asn
180 185 190

ggg gtg tac gga gga tct tca tac aac aga ggc acc cgg att aca aaa 912
Gly Val Tyr Gly Gly Ser Ser Tyr Asn Arg Gly Thr Arg Ile Thr Lys
195 200 205

gaa gta ttc gat aat ttg aca aac tgg aaa aac agc gcc caa 954
Glu Val Phe Asp Asn Leu Thr Asn Trp Lys Asn Ser Ala Gln
210 215 220

<210> 10 
<211> 318 
<212> PRT 

<213> Bacillus licheniformis CDJ31
<400> 10 

Met Lys Lys Ser Val Thr Arg Val Leu Met Ala Gly Leu Ile Gly Ile
-95 -90 -85

Ser Ile Tyr Ser Met Gly Ile Asp Ser Ala Gln Ala Ala Ser Ser Pro
-80 -75 -70 -65

His Thr Pro Val Ser Ser Asp Pro Ser Tyr Lys Pro Asp Ser Ser Ala
-60 -55 -50

Ser Tyr Asp Pro Ala Ile Lys Thr Asn Lys Asn Gly Ala Tyr Ser Lys
-45 -40 -35

Ala Phe Glu Gly Thr Gly Lys Leu Asp Ala Pro Leu Tyr Gln Glu Lys
-30 -25 -20

Ser Lys Pro Thr Lys Lys Ser Pro Ala Gly Pro Arg Tyr Ser Pro Lys
-15 -10 -5 -1

Ser Val Ile Gly Ser Asp Glu Arg Thr Arg Val Thr Asn Thr Thr Ala
1 5 10 15

Tyr Pro Tyr Arg Ala Ile Val His Ile Ser Ser Ser Ile Gly Ser Cys
20 25 30

Thr Gly Ser Leu Ile Gly Pro Lys Thr Val Ala Thr Ala Gly His Cys
35 40 45

Ile Tyr Asp Thr Ala Ser Gly Ser Phe Ala Gly Thr Ala Thr Val Ser
50 55 60

Pro Gly Arg Asn Gly Ser Thr Tyr Pro Tyr Gly Ser Val Thr Ser Thr
65 70 75 80

Arg Tyr Phe Ile Pro Ser Gly Tyr Arg Ser Gly Asn Ser Asn Tyr Asp
85 90 95

Tyr Gly Ala Ile Glu Leu Ser Gln Pro Ile Gly Asn Thr Val Gly Tyr
100 105 110

Phe Gly Tyr Ser Tyr Thr Thr Ser Ser Leu Val Gly Ser Ser Val Thr
115 120 125

Ile Ile Gly Tyr Pro Gly Asp Lys Thr Ser Gly Thr Gln Trp Gln Met
130 135 140

Ser Gly Asn Ile Ala Val Ser Glu Thr Tyr Lys Leu Gln Tyr Ala Ile
145 150 155 160

Asp Thr Tyr Gly Gly Gln Ser Gly Ser Pro Val Tyr Glu Ala Ser Ser
165 170 175

Ser Arg Thr Asn Cys Ser Gly Pro Cys Ser Leu Ala Val His Thr Asn
180 185 190

Gly Val Tyr Gly Gly Ser Ser Tyr Asn Arg Gly Thr Arg Ile Thr Lys
195 200 205

Glu Val Phe Asp Asn Leu Thr Asn Trp Lys Asn Ser Ala Gln
210 215 220



<210> 11 
<211> 906 
<212> DNA 

<213> Bacillus pumilus JA96

<220>
<221> CDS
<222> (1)..(906)

<220>
<221> mat_peptide
<222> (262)..(906)

<220>
<221> sig_peptide
<222> (1)..(75)
<223> pro-peptide (76) ... (261)

<400> 11

atg aaa aag gtg aaa aaa tta atc cct tct cta ctc gtt ttt ggt gct 48
Met Lys Lys Val Lys Lys Leu Ile Pro Ser Leu Leu Val Phe Gly Ala
-85 -80 -75

tta agt gtg cct agt ttt gcc cat gca gca tct gat tca gta ctt acg 96
Leu Ser Val Pro Ser Phe Ala His Ala Ala Ser Asp Ser Val Leu Thr
-70 -65 -60

tct gat tat gac atg gtg act tct gac gga aag gtg att tct tca gct 144
Ser Asp Tyr Asp Met Val Thr Ser Asp Gly Lys Val Ile Ser Ser Ala
-55 -50 -45 -40

gac ttc cac aac gat atg aaa acc ccc tca tcc ttt gac aaa gtg gat 192
Asp Phe His Asn Asp Met Lys Thr Pro Ser Ser Phe Asp Lys Val Asp
-35 -30 -25

gat ctc tct tct act att ggc gaa aaa gta aaa cca ctc aca aca tat 240
Asp Leu Ser Ser Thr Ile Gly Glu Lys Val Lys Pro Leu Thr Thr Tyr
-20 -15 -10

tta aaa gac ttt caa aca aaa gta gtc att gga gac gat ggt aga aca 288
Leu Lys Asp Phe Gln Thr Lys Val Val Ile Gly Asp Asp Gly Arg Thr
-5 -1 1 5

aaa gtg acg aat aca aga gta gca ccc tat aat tct att gct tat att 336
Lys Val Thr Asn Thr Arg Val Ala Pro Tyr Asn Ser Ile Ala Tyr Ile
10 15 20 25

aca ttt ggt gga tct agc tgc act gga aca ctc att gct cca aac aaa 384
Thr Phe Gly Gly Ser Ser Cys Thr Gly Thr Leu Ile Ala Pro Asn Lys
30 35 40

ata ttg aca aac gga cac tgc gtc tac aat aca gcc aca aga agt tat 432
Ile Leu Thr Asn Gly His Cys Val Tyr Asn Thr Ala Thr Arg Ser Tyr
45 50 55

agt gca aaa ggg tct gtc tac cca ggc atg aat gac agc acg gct gtg 480
Ser Ala Lys Gly Ser Val Tyr Pro Gly Met Asn Asp Ser Thr Ala Val
60 65 70

aac ggc tca gca aac atg acc gaa ttc tat gta cca agc gga tat atc 528
Asn Gly Ser Ala Asn Met Thr Glu Phe Tyr Val Pro Ser Gly Tyr Ile
75 80 85

aac acg ggg gcg agt caa tat gat ttt gcc gtc att aaa aca gat acg 576
Asn Thr Gly Ala Ser Gln Tyr Asp Phe Ala Val Ile Lys Thr Asp Thr
90 95 100 105

aac att gga aat acg gtc ggc tat cgc tct att cgt caa gtg aca aat 624
Asn Ile Gly Asn Thr Val Gly Tyr Arg Ser Ile Arg Gln Val Thr Asn
110 115 120

cta aca ggt aca acg att aaa att tct gga tat cca ggt gat aaa atg 672
Leu Thr Gly Thr Thr Ile Lys Ile Ser Gly Tyr Pro Gly Asp Lys Met
125 130 135

aga tcg act ggc aaa gtg tca caa tgg gaa atg tca ggt cca gtc acg 720
Arg Ser Thr Gly Lys Val Ser Gln Trp Glu Met Ser Gly Pro Val Thr
140 145 150

aga gaa gat acg aat ctc gca tac tat acg atc gat aca ttt agc gga 768
Arg Glu Asp Thr Asn Leu Ala Tyr Tyr Thr Ile Asp Thr Phe Ser Gly
155 160 165

aac tct ggc tct gcg atg cta gat cag aac caa caa atc gtc ggg gtc 816
Asn Ser Gly Ser Ala Met Leu Asp Gln Asn Gln Gln Ile Val Gly Val
170 175 180 185

cat aat gcg ggt tat tca aat gga acg atc aac ggt gga cca aaa gcg 864
His Asn Ala Gly Tyr Ser Asn Gly Thr Ile Asn Gly Gly Pro Lys Ala
190 195 200

act gct gcc ttt gtt gaa ttt atc aac tat gcg aag gcg caa 906
Thr Ala Ala Phe Val Glu Phe Ile Asn Tyr Ala Lys Ala Gln
205 210 215

<210> 12 
<211> 302 
<212> PRT 

<213> Bacillus pumilus JA96 
<400> 12 

Met Lys Lys Val Lys Lys Leu Ile Pro Ser Leu Leu Val Phe Gly Ala
-85 -80 -75

Leu Ser Val Pro Ser Phe Ala His Ala Ala Ser Asp Ser Val Leu Thr
-70 -65 -60

Ser Asp Tyr Asp Met Val Thr Ser Asp Gly Lys Val Ile Ser Ser Ala
-55 -50 -45 -40

Asp Phe His Asn Asp Met Lys Thr Pro Ser Ser Phe Asp Lys Val Asp
-35 -30 -25

Asp Leu Ser Ser Thr Ile Gly Glu Lys Val Lys Pro Leu Thr Thr Tyr
-20 -15 -10

Leu Lys Asp Phe Gln Thr Lys Val Val Ile Gly Asp Asp Gly Arg Thr
-5 -1 1 5

Lys Val Thr Asn Thr Arg Val Ala Pro Tyr Asn Ser Ile Ala Tyr Ile
10 15 20 25

Thr Phe Gly Gly Ser Ser Cys Thr Gly Thr Leu Ile Ala Pro Asn Lys
30 35 40

Ile Leu Thr Asn Gly His Cys Val Tyr Asn Thr Ala Thr Arg Ser Tyr
45 50 55

Ser Ala Lys Gly Ser Val Tyr Pro Gly Met Asn Asp Ser Thr Ala Val
60 65 70

Asn Gly Ser Ala Asn Met Thr Glu Phe Tyr Val Pro Ser Gly Tyr Ile
75 80 85

Asn Thr Gly Ala Ser Gln Tyr Asp Phe Ala Val Ile Lys Thr Asp Thr
90 95 100 105

Asn Ile Gly Asn Thr Val Gly Tyr Arg Ser Ile Arg Gln Val Thr Asn
110 115 120

Leu Thr Gly Thr Thr Ile Lys Ile Ser Gly Tyr Pro Gly Asp Lys Met
125 130 135

Arg Ser Thr Gly Lys Val Ser Gln Trp Glu Met Ser Gly Pro Val Thr
140 145 150

Arg Glu Asp Thr Asn Leu Ala Tyr Tyr Thr Ile Asp Thr Phe Ser Gly
155 160 165

Asn Ser Gly Ser Ala Met Leu Asp Gln Asn Gln Gln Ile Val Gly Val
170 175 180 185

His Asn Ala Gly Tyr Ser Asn Gly Thr Ile Asn Gly Gly Pro Lys Ala
190 195 200

Thr Ala Ala Phe Val Glu Phe Ile Asn Tyr Ala Lys Ala Gln
205 210 215

<210> 13 
<211> 939 
<212> DNA 

<213> Bacillus subtilis IS75 

<220> 

<221> CDS 

<222> (1)..(939)

<220>
<221> mat_peptide
<222> (280)..(939)

<220>
<221> sig_peptide
<222> (1)..(102)
<223> pro-peptide (103) ... (279)

<400> 13

atg aaa tta gtt cca aga ttc aga aaa caa tgg ttc act tac tta sea as 
Met Lys Leu val Pro Arg Phe Arg Lys Gin T?p Phe Ala Ty? l2 Thr 
~ yu -85 -80 

8? 12 m £3 JK S3 IS 2S XB «B S fl* & g* fi» * 

~ /:> -70 -65 

s: « jk su as ss as as s «h s as s? §?; st as 144 

IS SJ IS «S SS as £ k «; © $ « ^ i 92 

35 -30 

SS 5? S?3 ft 3? I?? f» «? IS »» S a «c g ? c g? c caa 2 40 

~ 25 -20 _if 

i aa . ctg gaa aaa aac att caa acc tta caq cct tea aar att atr ?rs 
Thr Glu Leu Glu Lys Asn He Gin Thr Leu Gin Pro Ser ler lie ?le 
" xu -5 -1 1 

gga act gat gaa cgc acc aga ate tec age acq aca tct tt* cm i-ai- *m 
Gty Thr Asp Glu Arg Thr A?g He Ser sir Thr Thr ler Phe Pro T?r 6 

Ara a? 3 Thr 8Si Si* . Ctg tca a ? c aag tat ccc aac a « tea age act 384 
Arg Ala Thr val Gin Leu ser lie Lys Tyr Pro Asn Thr ser sir Thr 
u 2 * 30 35 

tSJ g?5 rSc ?h C 2? a S* tta gtc aat cca aat a ca gtc gtc acg get 432 
Tyr Gly cys Thr Gly Phe Leu val Asn Pro Asn Thr val val Thr Ala 

40 45 50 

« ss $ a? a* is si a? a? js s s? jr as 480 

" 60 65 
ai5 ?? 9 S CQ ggc cgc aat 99* tc 9 tea tat°ccg tac qqt act tat tea 5?8 
Ala Ala Pro Gly Arg Asn cfy Ser Ser Tyr Pro Tyr ifj Th? Tyr tlr 
/u 75 gQ 

as f ss jk g g ?|f l" a; g its ss rs jk ss 576 



95 



aac tat gat tac gga get att aaa tta aac qqt tct cct aaa aar am ro/l 
Asn Tyr Asp Tyr Jy Ala He Lys Leu Asn &} ffi pS if? SS fh? " 4 

105 110 H5 

<£} 5? c S" tac gqc tac c 99 act aca aac age age aqt ccc ota aac 672 
val cty Trp Tyr Gly Tyr Arg Thr Thr Asn Ser sir ser Pro Va? g?5 
120 125 130 

LeJ ££2 Iff. ?P 5? a t J c cca tgt Qac aaa a cc ttt ggc acg 720 
Leu ser ser ser vaT Thr Gly Phe pro cys Asp Lys Thr Phe Gly Thr 

140 145 

S3 g HI XS g S8 gg III 35 g JK §?2 » g ffl 768 

g gf g ?g as g gs as g §?s ii? g SS 83 g 816 
g g a as g g g? «j i?? ?s jR si - 5? c s? « 8 6< 

ou 185 190 195 

m k g is k as a ^ & «. - g* ? « 912 

^00 205 210 

aac aat att caa tat tgg gca aat caa 939 

Asn Asn lie Gin Tyr Trp Ala Asn Gin 
215 220 

<210> 14 
<211> 313 
<212> PRT

<213> Bacillus subtilis IS75
<400> 14 

Met Lys Leu Val Pro Arg Phe Arg Lys Gln Trp Phe Ala Tyr Leu Thr
-90 -85 -80

Val Leu Cys Leu Ala Leu Ala Ala Ala Val Ser Phe Gly Val Pro Ala
-75 -70 -65

Lys Ala Ala Glu Asn Pro Gln Thr Ser Val Ser Asn Thr Gly Lys Glu
-60 -55 -50

Ala Asp Ala Thr Lys Asn Gln Thr Ser Lys Ala Asp Gln Val Ser Ala
-45 -40 -35 -30

Pro Tyr Glu Gly Thr Gly Lys Thr Ser Lys Ser Leu Tyr Gly Gly Gln
-25 -20 -15

Thr Glu Leu Glu Lys Asn Ile Gln Thr Leu Gln Pro Ser Ser Ile Ile
-10 -5 -1 1

Gly Thr Asp Glu Arg Thr Arg Ile Ser Ser Thr Thr Ser Phe Pro Tyr
5 10 15

Arg Ala Thr Val Gln Leu Ser Ile Lys Tyr Pro Asn Thr Ser Ser Thr
20 25 30 35

Tyr Gly Cys Thr Gly Phe Leu Val Asn Pro Asn Thr Val Val Thr Ala
40 45 50

Gly His Cys Val Tyr Ser Gln Asp His Gly Trp Ala Ser Thr Ile Thr
55 60 65

Ala Ala Pro Gly Arg Asn Gly Ser Ser Tyr Pro Tyr Gly Thr Tyr Ser
70 75 80

Gly Thr Met Phe Tyr Ser Val Lys Gly Trp Thr Glu Ser Lys Asp Thr
85 90 95

Asn Tyr Asp Tyr Gly Ala Ile Lys Leu Asn Gly Ser Pro Gly Asn Thr
100 105 110 115

Val Gly Trp Tyr Gly Tyr Arg Thr Thr Asn Ser Ser Ser Pro Val Gly
120 125 130

Leu Ser Ser Ser Val Thr Gly Phe Pro Cys Asp Lys Thr Phe Gly Thr
135 140 145

Met Trp Ser Asp Thr Lys Pro Ile Arg Ser Ala Glu Thr Tyr Lys Leu
150 155 160

Thr Tyr Thr Thr Asp Thr Tyr Gly Cys Gln Ser Gly Ser Pro Val Tyr
165 170 175

Arg Asn Tyr Ser Asp Thr Gly Gln Thr Ala Ile Ala Ile His Thr Asn
180 185 190 195

Gly Gly Ser Ser Tyr Asn Leu Gly Thr Arg Val Thr Asn Asp Val Phe
200 205 210

Asn Asn Ile Gln Tyr Trp Ala Asn Gln
215 220



<210> 15 
<211> 909 
<212> DNA 

<213> Bacillus intermedius 

<220> 

<221> CDS 

<222> (1).. (909) 

<220> 

<221> mat_peptide 
<222> (265)..(909)

<220>
<221> sig_peptide
<222> (1)..(78)
<223> pro-peptide (79) ... (264)

<400> 15

Met Met tti <£? f aa S tg tta ctc cct tct cta etc gtt ttt ggt 48 

Met Met Lys Lys Val Lys Met Leu Leu Pro Ser Leu Leu Val Phe Gly 
-»5 -80 -75 

gct tta agt gtg cct agt ttt gcc cat gcc aca tcg gat tca gta cta 96
Ala Leu Ser Val Pro Ser Phe Ala His Ala Thr Ser Asp Ser Val Leu
-70 -65 -60



ss ss? as s? as ss raj «? sss as as rs ss? ss is? ss 144 

" -50 _ 45 

si? fen Phf S?^ aat f at 2E 9 aaa tcc ccc tc * tec ttt gac aaa gtq 192 
ser Asp Phe hts Asn Asp Thr Lys ser Pro ser ser Phe Asp Lys VaT 

-30 -25 

SS SS SS 2? £ B? S IS l?S K 31? SS SS 25 SS SS 240 

"20 -15 -10 

S? 2! SS SSS ffi S?S ?S? SJ SS? S3 SS 1?? JBS Bp IS SS 288 

" -1 1 5 

?5? S a S SS? JBS SS ?S SS 83 S?t SS 5? as IS? IS IS ss 336 

v 15 20 

ss ?s? sss is as ss? us sss ?s? is ?s? as ss s?s ss as 384 

iU 35 40 

K SS SSS ffi? as §S SS SSS S3 S? SS ?S? JH SS? SS IIS 432 

* 50 55 

ss lis s?s sa is is? ss? ss ss as ss as as us ?s i?i 480 

ou 65 70 

S3 as is ss? ss sss as? ?ss i?s sss ss ss? ss tas is ss 528 

/3 80 85 

ss as ?s? is si ss? gs s? as sss s?s S3 ss sa ?s? as 576 

fSf sss ss is sss ss? is ss ss sss ss sss s?s SS? ffi? 624 

±±v 115 120 

sss sss ss is ?s? fs ss ?a ss sss is ss ss is as ss 672 

130 135 

ss? ss sss jS? IS SSS S3 ss? s?s s? BS SS SS? HJ SSS S3 720 

X4U 145 150 

?s? s?i as as ?s? as sss k ss ss ss ss as ?s? sss sss 768 

x:>:> 160 165 

is as ss? is ss? a 3 ?! ss sss sss s?s as s?s s?s ss ss? is 816 

±/u 175 180 J 

f|| SB sss s?s is ss ss? sss is ?ss ss sss is is ss sss 864 

iyu 195 200 

gcg aca gct gcc ttt gtt gaa ttt atc aac tat gca aaa gcg caa 909
Ala Thr Ala Ala Phe Val Glu Phe Ile Asn Tyr Ala Lys Ala Gln
205 210 215



<210> 16 
<211> 303 
<212> PRT

<213> Bacillus intermedius 
<400> 16 

Met Met Lys Lys Val Lys Met Leu Leu Pro Ser Leu Leu Val Phe Gly
-85 -80 -75

Ala Leu Ser Val Pro Ser Phe Ala His Ala Thr Ser Asp Ser Val Leu
-70 -65 -60

Thr Ser Asp Tyr Asp Met Val Thr Ser Asp Gly Lys Val Ile Ser Ser
-55 -50 -45

Ser Asp Phe His Asn Asp Thr Lys Ser Pro Ser Ser Phe Asp Lys Val
-40 -35 -30 -25

Asp Asp Leu Ser Ser Thr Ser Gly Glu Lys Val Lys Pro Leu Ser Lys
-20 -15 -10

Tyr Leu Lys Asp Phe Gln Thr Lys Val Val Ile Gly Asp Asp Gly Arg
-5 -1 1 5

Thr Lys Val Ala Asn Thr Arg Val Ala Pro Tyr Asn Ser Ile Ala Tyr
10 15 20

Ile Thr Phe Gly Gly Ser Ser Cys Thr Gly Thr Leu Ile Ala Pro Asn
25 30 35 40

Lys Ile Leu Thr Asn Gly His Cys Val Tyr Asn Thr Ala Ser Arg Ser
45 50 55

Tyr Ser Ala Lys Gly Ser Val Tyr Pro Gly Met Asn Asp Ser Thr Ala
60 65 70

Val Asn Gly Ser Ala Asn Met Thr Glu Phe Tyr Val Pro Ser Gly Tyr
75 80 85

Ile Asn Thr Gly Ala Ser Gln Tyr Asp Phe Ala Val Ile Lys Thr Asp
90 95 100

Thr Asn Ile Gly Asn Thr Val Gly Tyr Arg Ser Ile Arg Gln Val Thr
105 110 115 120

Asn Leu Thr Gly Thr Thr Ile Lys Ile Ser Gly Tyr Pro Gly Asp Lys
125 130 135

Met Arg Ser Thr Gly Lys Val Ser Gln Trp Glu Met Ser Gly Ser Val
140 145 150

Thr Arg Glu Asp Thr Asn Leu Ala Tyr Tyr Thr Ile Asp Thr Phe Ser
155 160 165

Gly Asn Ser Gly Ser Ala Met Leu Asp Gln Asn Gln Gln Ile Val Gly
170 175 180

Val His Asn Ala Gly Tyr Ser Asn Gly Thr Ile Asn Gly Gly Pro Lys
185 190 195 200

Ala Thr Ala Ala Phe Val Glu Phe Ile Asn Tyr Ala Lys Ala Gln
205 210 215

<210> 17
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 17

ctgtgccctt taaccgcaca gc 22 



<210> 18 
<211> 24 
<212> DNA 

<213> Artificial Sequence
<220>

<223> Primer
<400> 18

gcataagctt ttacaggtac cggc 24 
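
The declared lengths of the two primer entries can be checked against the sequences as printed, and their reverse complements derived, with a few lines of code. The following is a minimal Python sketch; the helper names `clean` and `reverse_complement` and the `primers` dictionary are illustrative assumptions, not part of the listing.

```python
# Complement mapping for unambiguous lowercase DNA bases.
COMPLEMENT = str.maketrans("acgt", "tgca")


def clean(seq):
    """Drop the spacing used in the listing and normalise to lowercase."""
    return "".join(seq.split()).lower()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


# SEQ ID NO -> (sequence as printed, length declared under <211>)
primers = {
    17: ("ctgtgccctt taaccgcaca gc", 22),
    18: ("gcataagctt ttacaggtac cggc", 24),
}

for seq_id, (printed, declared) in primers.items():
    seq = clean(printed)
    status = "ok" if len(seq) == declared else "length mismatch"
    print(seq_id, len(seq), status, reverse_complement(seq))
```

Applying `reverse_complement` twice returns the original sequence, which serves as a quick self-check of the helper.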